Breast Specific Genes and Proteins

ABSTRACT

Human breast specific gene polypeptides and DNA (RNA) encoding such polypeptides and a procedure for producing such polypeptides by recombinant techniques are disclosed. Also disclosed are methods for utilizing such polynucleotides or polypeptides as a diagnostic marker for breast cancer and as an agent to determine if breast cancer has metastasized. Also disclosed are antibodies specific to the breast specific gene polypeptides which may be used to target cancer cells and be used as part of a breast cancer vaccine. Methods of screening for antagonists for the polypeptide and therapeutic uses thereof are also disclosed.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 10/267,849, filed Oct. 10, 2002, which is a continuation of U.S. application Ser. No. 08/673,284, filed Jun. 28, 1996; claims benefit under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 60/000,602, filed Jun. 30, 1995; each of which is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, and the use of such polynucleotides and polypeptides for detecting disorders of the breast, particularly the presence of breast cancer and breast cancer metastases. The present invention further relates to inhibiting the production and function of the polypeptides of the present invention. The twenty breast specific genes of the present invention are sometimes hereinafter referred to as “BSG1”, “BSG2” etc.

The mammary gland is subject to a variety of disorders that should be readily detectable. Detection may be accomplished by inspection which usually consists of palpation. Unfortunately, so few periodic self-examinations are made that many breast masses are discovered only by accidental palpation. Aspiration of suspected cysts with a fine-gauge needle is another fairly common diagnostic practice. Mammography or xeroradiography (soft-tissue x-ray) of the breast is yet another. A biopsy of a lesion or suspected area is an extreme method of diagnostic test.

There are many types of tumors and cysts which affect the mammary gland. Fibroadenoma is the most common benign breast tumor. As a pathological entity, it ranks third behind cystic disease and carcinoma, respectively. These tumors are seen most frequently in young people and are usually readily recognized because they feel encapsulated. Fibrocystic disease, a benign condition, is the most common disease of the female breast, occurring in about 20% of pre-menopausal women. Lipomas of the breast are also common and they are benign in nature. Carcinoma of the breast is the most common malignant condition among women and carries with it the highest fatality rate of all cancers affecting this sex. At some point during her life, one of every 15 women in the USA will develop cancer of the breast. Its reported annual incidence is 70 per 100,000 females in the population in 1947, rising to 72.5 in 1969 for whites, and rising from 47.8 to 60.1 for blacks. The annual mortality rate from 1930 to the present has remained fairly constant, at approximately 23 per 100,000 female population. Breast cancer is rare in men, but when it does occur, it is usually not recognized until late, and thus the results of treatment are poor. In women, carcinoma of the breast is rarely seen before age 30 and the incidence rises rapidly after menopause. For this reason, post-menopausal breast masses should be considered cancer until proved otherwise.

BRIEF SUMMARY OF THE INVENTION

In accordance with an aspect of the present invention, there are provided nucleic acid probes comprising nucleic acid molecules of sufficient length to specifically hybridize to the RNA transcribed from the human breast specific genes of the present invention or to DNA corresponding to such RNA.

In accordance with another aspect of the present invention there is provided a method of and products for diagnosing breast cancer formation and breast cancer metastases by detecting the presence of RNA transcribed from the human breast specific genes of the present invention or DNA corresponding to such RNA in a sample derived from a host.

In accordance with yet another aspect of the present invention, there is provided a method of and products for diagnosing breast cancer formation and breast cancer metastases by detecting an altered level of a polypeptide corresponding to the breast specific genes of the present invention in a sample derived from a host, whereby an elevated level of the polypeptide indicates a breast cancer diagnosis.

In accordance with another aspect of the present invention, there are provided isolated polynucleotides encoding human breast specific polypeptides, including mRNAs, DNAs, cDNAs, genomic DNAs, as well as antisense analogs and biologically active and diagnostically or therapeutically useful fragments thereof.

In accordance with still another aspect of the present invention there are provided human breast specific genes which include polynucleotides as set forth in the sequence listing.

In accordance with a further aspect of the present invention, there are provided novel polypeptides encoded by the polynucleotides, as well as biologically active and diagnostically or therapeutically useful fragments, analogs and derivatives thereof.

In accordance with yet a further aspect of the present invention, there is provided a process for producing such polypeptides by recombinant techniques comprising culturing recombinant prokaryotic and/or eukaryotic host cells, containing a polynucleotide of the present invention, under conditions promoting expression of said proteins and subsequent recovery of said proteins.

In accordance with yet a further aspect of the present invention, there are provided antibodies specific to such polypeptides, which may be employed to detect breast cancer cells or breast cancer metastasis.

In accordance with another aspect of the present invention, there are provided processes for using one or more of the polypeptides of the present invention to treat breast cancer and for using the polypeptides to screen for compounds which interact with the polypeptides, for example, compounds which inhibit or activate the polypeptides of the present invention.

In accordance with yet another aspect of the present invention, there is provided a screen for detecting compounds which inhibit activation of one or more of the polynucleotides and/or polypeptides of the present invention which may be used therapeutically, for example, in the treatment of breast cancer.

In accordance with yet a further aspect of the present invention, there are provided processes for utilizing such polypeptides, or polynucleotides encoding such polypeptides, for in vitro purposes related to scientific research, synthesis of DNA and manufacture of DNA vectors.

These and other aspects of the present invention should be apparent to those skilled in the art from the teachings herein.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings are illustrative of embodiments of the invention and are not meant to limit the scope of the invention as encompassed by the claims.

FIG. 1 (SEQ ID NO: 1) is a full length cDNA sequence of breast specific gene 1 of the present invention.

FIG. 2 (SEQ ID NO:2) is a partial cDNA sequence and the corresponding deduced amino acid sequence of breast specific gene 2 (HBGBP 46) of the present invention.

FIG. 3 (SEQ ID NO:4) is a partial cDNA sequence and deduced amino acid sequence of breast specific gene 3 of the invention.

FIG. 4 (SEQ ID NO:6) is a partial cDNA sequence and the corresponding deduced amino acid sequence of breast specific gene 4 of the present invention.

FIG. 5 (SEQ ID NO:8) is a partial cDNA sequence of breast specific gene 5 of the present invention.

FIG. 6 (SEQ ID NO:9) is a partial cDNA and deduced amino acid sequence of breast specific gene 6 of the present invention.

FIG. 7 (SEQ ID NO: 11) is a partial cDNA sequence of breast specific gene 7 of the present invention.

FIG. 8 (SEQ ID NO: 12) is a partial cDNA sequence of breast specific gene 8 of the present invention.

FIG. 9 (SEQ ID NO: 13) is a partial cDNA sequence of breast specific gene 9 of the present invention.

FIG. 10 (SEQ ID NO: 14) is a partial cDNA sequence of breast specific gene 10 of the present invention.

FIG. 11 (SEQ ID NO: 15) is a partial cDNA sequence of breast specific gene 11 of the present invention.

FIG. 12 (SEQ ID NO:16) is a partial cDNA sequence of breast specific gene 12 of the present invention.

FIG. 13 (SEQ ID NO:17) is a partial cDNA sequence of breast specific gene 13 of the present invention.

FIG. 14 (SEQ ID NO: 18) is a partial cDNA sequence of breast specific gene 14 of the present invention.

FIG. 15 (SEQ ID NO: 19) is a partial cDNA sequence of breast specific gene 15 of the present invention.

FIG. 16 (SEQ ID NO:20) is a partial cDNA sequence of breast specific gene 16 of the present invention.

FIG. 17 (SEQ ID NO:21) is a partial cDNA sequence of breast specific gene 17 of the present invention.

FIG. 18 (SEQ ID NO:22) is a partial cDNA sequence of breast specific gene 18 of the present invention.

FIG. 19 (SEQ ID NO:23) is a partial cDNA sequence of breast specific gene 19 of the present invention.

FIG. 20 (SEQ ID NO:24) is a partial cDNA sequence of breast specific gene 20 of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

The term “breast specific gene” means that such gene is primarily expressed in tissues derived from the breast, and such genes may be expressed in cells derived from tissues other than from the breast. However, the expression of such genes is significantly higher in tissues derived from the breast than from non-breast tissues.

In accordance with one aspect of the present invention there is provided a polynucleotide which encodes a mature polypeptide encoded by the polynucleotide having the sequence of FIG. 1 (SEQ ID NO: 1) and fragments, analogues and derivatives thereof.

In accordance with a further aspect of the present invention there is provided a polynucleotide which encodes the same mature polypeptide as a human gene having a coding portion which contains a polynucleotide which is at least 90% identical (preferably at least 95% identical and most preferably at least 97% or 100% identical) to one of the polynucleotides of FIGS. 2-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24), as well as fragments thereof.

In accordance with still another aspect of the present invention there is provided a polynucleotide which encodes for the same mature polypeptide as a human gene whose coding portion includes a polynucleotide which is at least 90% identical to (preferably at least 95% identical to and most preferably at least 97% or 100% identical to) one of the polynucleotides included in ATCC® Deposit No. 97175 of Jun. 2, 1995.

In accordance with yet another aspect of the present invention, there is provided a polynucleotide probe which hybridizes to mRNA (or the corresponding cDNA) which is transcribed from the coding portion of a human gene which coding portion includes a DNA sequence which is at least 90% identical to (preferably at least 95% identical to and most preferably at least 97% or 100% identical to) one of the polynucleotide sequences of FIGS. 1-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24).

The present invention further relates to a mature polypeptide encoded by a coding portion of a human gene which coding portion includes a DNA sequence which is at least 90% identical to (preferably at least 95% identical to and more preferably 97% or 100% identical to) one of the polynucleotides of FIGS. 2-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24), and analogues, derivatives and fragments thereof.

The present invention also relates to one of the mature polypeptides encoded by the polynucleotide of FIG. 1 (SEQ ID NO: 1) and fragments, analogues and derivatives thereof.

The present invention further relates to the same mature polypeptide encoded by a human gene whose coding portion includes DNA which is at least 90% identical to (preferably at least 95% identical to and more preferably at least 97% or 100% identical to) one of the polynucleotides included in ATCC® Deposit No. 97175 deposited Jun. 2, 1995.

In accordance with an aspect of the present invention, there are provided isolated nucleic acids (polynucleotides) which encode for the mature polypeptides encoded by the polynucleotide of FIG. 1 (SEQ ID NO: 1) or fragments, analogues or derivatives thereof.

The polynucleotides of the present invention may be in the form of RNA or in the form of DNA, which DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double-stranded or single-stranded, and if single stranded may be the coding strand or non-coding (anti-sense) strand. The coding sequence which encodes the mature polypeptide may include DNA identical to FIGS. 1-20 (SEQ ID NO: 1-2, 4, 6, 8-9, 10-24) or that of the deposited clone or may be a different coding sequence which coding sequence, as a result of the redundancy or degeneracy of the genetic code, encodes the same mature polypeptide as the coding sequence of a gene which coding sequence includes the DNA of FIGS. 1-20 (SEQ ID NO: 1-2, 4, 6, 8-9, 10-24) or the deposited cDNA.
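The degeneracy of the genetic code referred to above can be illustrated with a minimal sketch in Python. The two short DNA strings below are hypothetical examples chosen only for illustration and are not taken from the sequence listing; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA string in frame 1, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequences: different DNA, synonymous codons at three positions.
seq_a = "ATGGCTCTGAAA"
seq_b = "ATGGCCTTAAAG"

same_polypeptide = translate(seq_a) == translate(seq_b)
```

Here `seq_a` and `seq_b` differ at the DNA level yet `translate` returns the same amino acid string for both, which is the sense in which a "different coding sequence" may encode the same mature polypeptide.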

The polynucleotide which encodes a mature polypeptide of the present invention may include, but is not limited to: only the coding sequence for the mature polypeptide; the coding sequence for the mature polypeptide and additional coding sequence such as a leader or secretory sequence or a proprotein sequence; the coding sequence for the mature polypeptide (and optionally additional coding sequence) and non-coding sequence, such as introns or non-coding sequence 5′ and/or 3′ of the coding sequence for the mature polypeptide.

Thus, the term “polynucleotide encoding a polypeptide” encompasses a polynucleotide which includes only coding sequence for the polypeptide as well as a polynucleotide which includes additional coding and/or non-coding sequence.

The present invention further relates to variants of the hereinabove described polynucleotides which encode fragments, analogs and derivatives of a mature polypeptide of the present invention. The variant of the polynucleotide may be a naturally occurring allelic variant of the polynucleotide or a non-naturally occurring variant of the polynucleotide.

Thus, the present invention includes polynucleotides encoding the same mature polypeptide as hereinabove described as well as variants of such polynucleotides which variants encode a fragment, derivative or analog of a polypeptide of the invention. Such nucleotide variants include deletion variants, substitution variants and addition or insertion variants.

The polynucleotides of the invention may have a coding sequence which is a naturally occurring allelic variant of the human gene whose coding sequence includes DNA as shown in FIGS. 1-20 (SEQ ID NO: 1-2, 4, 6, 8-9, 10-24) or of the deposited clone. As known in the art, an allelic variant is an alternate form of a polynucleotide sequence which may have a substitution, deletion or addition of one or more nucleotides, which does not substantially alter the function of the encoded polypeptide.

The present invention also includes polynucleotides, wherein the coding sequence for the mature polypeptide may be fused in the same reading frame to a polynucleotide sequence which aids in expression and secretion of a polypeptide from a host cell, for example, a leader sequence which functions as a secretory sequence for controlling transport of a polypeptide from the cell. The polypeptide having a leader sequence is a preprotein and may have the leader sequence cleaved by the host cell to form the mature form of the polypeptide. The polynucleotides may also encode a proprotein which is the mature protein plus additional 5′ amino acid residues. A mature protein having a prosequence is a proprotein and is an inactive form of the protein. Once the prosequence is cleaved an active mature protein remains.

Thus, for example, the polynucleotide of the present invention may encode a mature protein, or a protein having a prosequence or a protein having both a prosequence and a presequence (leader sequence).

The polynucleotides of the present invention may also have the coding sequence fused in frame to a marker sequence which allows for purification of the polypeptide of the present invention. The marker sequence may be a hexa-histidine tag supplied by a pQE-9 vector to provide for purification of the mature polypeptide fused to the marker in the case of a bacterial host, or, for example, the marker sequence may be a hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 cells, is used. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson, I., et al., Cell, 37:767 (1984)).

The present invention further relates to polynucleotides which hybridize to the hereinabove-described polynucleotides if there is at least 70%, preferably at least 90%, and more preferably at least 95% identity between the sequences. The present invention particularly relates to polynucleotides which hybridize under stringent conditions to the hereinabove-described polynucleotides. As herein used, the term “stringent conditions” means hybridization will occur only if there is at least 95% and preferably at least 97% identity between the sequences. The polynucleotides which hybridize to the hereinabove described polynucleotides in a preferred embodiment encode polypeptides which retain substantially the same biological function or activity as the mature polypeptide of the present invention encoded by a coding sequence which includes the DNA of FIGS. 1-20 (SEQ ID NO: 1-2, 4, 6, 8-9, 10-24) or the deposited cDNA(s).

Alternatively, the polynucleotide may have at least 10 or 20 bases, preferably at least 30 bases, and more preferably at least 50 bases which hybridize to a polynucleotide of the present invention and which has an identity thereto, as hereinabove described, and which may or may not retain activity. For example, such polynucleotides may be employed as probes for polynucleotides, for example, for recovery of the polynucleotide or as a diagnostic probe or as a PCR primer.

Thus, the present invention is directed to polynucleotides having at least a 70% identity, preferably at least 90% and more preferably at least 95% identity to a polynucleotide which encodes the mature polypeptide encoded by a human gene which includes the DNA of one of FIGS. 1-20 (SEQ ID NO: 1-2, 4, 6, 8-9, 10-24) as well as fragments thereof, which fragments have at least 30 bases and preferably at least 50 bases and to polypeptides encoded by such polynucleotides.
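The identity thresholds recited above (70%, 90% and 95%) can be made concrete with a short sketch. The specification does not state how identity is computed; the Python example below assumes, purely for illustration, two sequences that are already aligned without gaps and are of equal length, and the sequences shown are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical bases in two pre-aligned, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_tier(pid):
    """Map a percent identity onto the thresholds recited in the text."""
    if pid >= 95.0:
        return "at least 95%"
    if pid >= 90.0:
        return "at least 90%"
    if pid >= 70.0:
        return "at least 70%"
    return "below 70%"


# Hypothetical 20-base reference and two variants (illustration only).
reference = "ATGGCTCTGAAAGTCCTGAC"
one_mismatch = "ATGGCTCTGAAAGTCCTGAT"   # differs at the last base
two_mismatches = "TTGGCTCTGAAAGTCCTGAT"  # differs at the first and last bases

tier_one = identity_tier(percent_identity(reference, one_mismatch))
tier_two = identity_tier(percent_identity(reference, two_mismatches))
```

With gapped alignments the denominator and the treatment of gap positions would have to be chosen explicitly; this sketch deliberately leaves that question aside.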

The partial sequences are specific tags for messenger RNA molecules. The complete sequence of that messenger RNA, in the form of cDNA, is determined using the partial sequence as a probe to identify a cDNA clone corresponding to a full-length transcript. The partial cDNA clone can also be used as a probe to identify a genomic clone or clones that contain the complete gene including regulatory and promoter regions, exons, and introns.

The partial sequences of FIGS. 2-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24) may be used to identify the corresponding full length gene from which they were derived. The partial sequences can be nick-translated or end-labelled with ³²P using polynucleotide kinase using labelling methods known to those with skill in the art (Basic Methods in Molecular Biology, L. G. Davis, M. D. Dibner, and J. F. Battey, ed., Elsevier Press, N.Y., 1986). A lambda library prepared from human breast tissue can be directly screened with the labelled sequences of interest or the library can be converted en masse to pBluescript (Stratagene Cloning Systems, La Jolla, Calif. 92037) to facilitate bacterial colony screening. Regarding pBluescript, see Sambrook et al., Molecular Cloning-A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989), pg. 1.20. Both methods are well known in the art. Briefly, filters with bacterial colonies containing the library in pBluescript or bacterial lawns containing lambda plaques are denatured and the DNA is fixed to the filters. The filters are hybridized with the labelled probe using hybridization conditions described by Davis et al., supra. The partial sequences, cloned into lambda or pBluescript, can be used as positive controls to assess background binding and to adjust the hybridization and washing stringencies necessary for accurate clone identification. The resulting autoradiograms are compared to duplicate plates of colonies or plaques; each exposed spot corresponds to a positive colony or plaque. The colonies or plaques are selected, expanded and the DNA is isolated from the colonies for further analysis and sequencing.

Positive cDNA clones are analyzed to determine the amount of additional sequence they contain using PCR with one primer from the partial sequence and the other primer from the vector. Clones with a larger vector-insert PCR product than the original partial sequence are analyzed by restriction digestion and DNA sequencing to determine whether they contain an insert of the same or similar size as the mRNA size determined from Northern blot analysis.

Once one or more overlapping cDNA clones are identified, the complete sequence of the clones can be determined. The preferred method is to use exonuclease III digestion (McCombie, W. R., Kirkness, E., Fleming, J. T., Kerlavage, A. R., Iovannisci, D. M., and Martin-Gallardo, R., Methods, 3:33-40, 1991). A series of deletion clones are generated, each of which is sequenced. The resulting overlapping sequences are assembled into a single contiguous sequence of high redundancy (usually three to five overlapping sequences at each nucleotide position), resulting in a highly accurate final sequence.
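The way redundancy at each nucleotide position improves accuracy can be sketched in Python as follows. This is an illustrative simplification, not the assembly procedure of the cited reference: it assumes each read has already been placed at a known offset on a common coordinate system, and the reads shown are hypothetical.

```python
from collections import Counter


def assemble_consensus(reads):
    """Build a majority-vote consensus from (offset, sequence) reads.

    Returns the consensus string and the per-position coverage (redundancy).
    Positions covered by no read are reported as 'N'.
    """
    length = max(offset + len(seq) for offset, seq in reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in reads:
        for i, base in enumerate(seq.upper()):
            columns[offset + i][base] += 1

    consensus = []
    coverage = []
    for column in columns:
        coverage.append(sum(column.values()))
        consensus.append(column.most_common(1)[0][0] if column else "N")
    return "".join(consensus), coverage


# Hypothetical overlapping reads; the last one carries a single error
# (an A where the other three reads agree on T).
reads = [
    (0, "ATGGCTCTGA"),
    (3, "GCTCTGAAAG"),
    (5, "TCTGAAAGTC"),
    (2, "GGCTCAGAAA"),
]

consensus, coverage = assemble_consensus(reads)
```

At the position where one read disagrees, the other three overlapping reads outvote it, so the error does not reach the consensus. Positions covered by only one read have no such protection, which is why several overlapping sequences per position are preferred.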

The DNA sequences (as well as the corresponding RNA sequences) also include sequences which are or contain a DNA sequence identical to one contained in and isolatable from ATCC® Deposit No. 97175, deposited Jun. 2, 1995, and fragments or portions of the isolated DNA sequences (and corresponding RNA sequences), as well as DNA (RNA) sequences encoding the same polypeptide. In particular, DNA (RNA) sequences encoding BSG1 (SEQ ID NO:32) and the amino acid sequence encoded thereby (SEQ ID NO:33) are contained in and isolatable from ATCC® Deposit No. 97175 using routine techniques known in the art. A cDNA clone, HBGBP46, encoding BSG2 was deposited as ATCC® Deposit No. PTA-1545 on Mar. 22, 2000, at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, Va. 20110-2209.

The deposit(s) referred to herein will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for purposes of Patent Procedure. These deposits are provided merely as a convenience to those of skill in the art and are not an admission that a deposit is required under 35 U.S.C. § 112. The sequence of the polynucleotides contained in the deposited materials, as well as the amino acid sequence of the polypeptides encoded thereby, are incorporated herein by reference and are controlling in the event of any conflict with any description of sequences herein. A license may be required to make, use or sell the deposited materials, and no such license is hereby granted.

The present invention further relates to polynucleotides which have at least 10 bases, preferably at least 20 bases, and may have 30 or more bases, which polynucleotides are hybridizable to and have at least a 70% identity to RNA (and DNA which corresponds to such RNA) transcribed from a human gene whose coding portion includes DNA as hereinabove described.

Thus, the polynucleotide sequences which hybridize as described above may be used to hybridize to and detect the expression of the human genes to which they correspond for use in diagnostic assays as hereinafter described.

In accordance with still another aspect of the present invention there are provided diagnostic assays for detecting micrometastases of breast cancer in a host. While applicant does not wish to limit the reasoning of the present invention to any specific scientific theory, it is believed that the presence of active transcription of a breast specific gene of the present invention in cells of the host, other than those derived from the breast, is indicative of breast cancer metastases. This is true because, while the breast specific genes are found in all cells of the body, their transcription to mRNA, cDNA and expression products is primarily limited to the breast in non-diseased individuals. However, if breast cancer is present, breast cancer cells migrate from the cancer to other cells, such that these other cells are now actively transcribing and expressing a breast specific gene at a greater level than is normally found in non-diseased individuals, i.e., transcription is higher than found in non-breast tissues in healthy individuals. It is the detection of this enhanced transcription or enhanced protein expression in cells, other than those derived from the breast, which is indicative of metastases of breast cancer.

In one example of such a diagnostic assay, an RNA sequence in a sample derived from a tissue other than the breast is detected by hybridization to a probe. The sample contains a nucleic acid or a mixture of nucleic acids, at least one of which is suspected of containing a human breast specific gene or fragment thereof of the present invention which is transcribed and expressed in such tissue. Thus, for example, in a form of an assay for determining the presence of a specific RNA in cells, initially RNA is isolated from the cells.

A sample may be obtained from cells derived from tissue other than from the breast including but not limited to blood, urine, saliva, tissue biopsy and autopsy material. The use of such methods for detecting enhanced transcription to mRNA from a human breast specific gene of the present invention or fragment thereof in a sample obtained from cells derived from other than the breast is well within the scope of those skilled in the art from the teachings herein.

The isolation of mRNA comprises isolating total cellular RNA by disrupting a cell and performing differential centrifugation. Once the total RNA is isolated, mRNA is isolated by making use of the adenine nucleotide residues known to those skilled in the art as a poly(A) tail found on virtually every eukaryotic mRNA molecule at the 3′ end thereof. Oligonucleotides composed of only deoxythymidine [oligo(dT)] are linked to cellulose and the oligo(dT)-cellulose packed into small columns. When a preparation of total cellular RNA is passed through such a column, the mRNA molecules bind to the oligo(dT) by the poly(A) tails while the rest of the RNA flows through the column. The bound mRNAs are then eluted from the column and collected.

One example of detecting isolated mRNA transcribed from a breast specific gene of the present invention comprises screening the collected mRNAs with the gene specific oligonucleotide probes, as hereinabove described.

It is also appreciated that such probes can be and are preferably labeled with an analytically detectable reagent to facilitate identification of the probe. Useful reagents include but are not limited to radioactivity, fluorescent dyes or enzymes capable of catalyzing the formation of a detectable product.

An example of detecting a polynucleotide complementary to the mRNA sequence (cDNA) utilizes the polymerase chain reaction (PCR) in conjunction with reverse transcriptase. PCR is a very powerful method for the specific amplification of DNA or RNA stretches (Saiki et al., Nature, 324:163-166 (1986)). One application of this technology is in nucleic acid probe technology to bring up nucleic acid sequences present in low copy numbers to a detectable level. Numerous diagnostic and scientific applications of this method have been described by H. A. Erlich (ed.) in PCR Technology-Principles and Applications for DNA Amplification, Stockton Press, USA, 1989, and by M. A. Innis (ed.) in PCR Protocols, Academic Press, San Diego, USA, 1990.

RT-PCR is a combination of PCR with the reverse transcriptase enzyme. Reverse transcriptase is an enzyme which produces cDNA molecules from corresponding mRNA molecules. This is important since PCR amplifies nucleic acid molecules, particularly DNA, and this DNA may be produced from the mRNA isolated from a sample derived from the host.

A specific example of an RT-PCR diagnostic assay involves removing a sample from a tissue of a host. Such a sample will be from a tissue, other than the breast, for example, blood. Therefore, an example of such a diagnostic assay comprises whole blood gradient isolation of nucleated cells, total RNA extraction, RT-PCR of total RNA and agarose gel electrophoresis of PCR products. The PCR products comprise cDNA complementary to RNA transcribed from one or more breast specific genes of the present invention or fragments thereof. More particularly, a blood sample is obtained and the whole blood is combined with an equal volume of phosphate buffered saline, centrifuged and the lymphocyte and granulocyte layer is carefully aspirated and rediluted in phosphate buffered saline and centrifuged again. The supernate is discarded and the pellet containing nucleated cells is used for RNA extraction using the RNAzol B method as described by the manufacturer (Tel-Test Inc., Friendswood, Tex.).

Oligonucleotide primers and probes are prepared with high specificity to the DNA sequences of the present invention. The probes are at least 10 base pairs in length, preferably at least 30 base pairs in length and most preferably at least 50 base pairs in length or more. The reverse transcriptase reaction and PCR amplification are performed sequentially without interruption. Taq polymerase is used during PCR and the PCR products are concentrated and the entire sample is run on a Tris-borate-EDTA agarose gel containing ethidium bromide.

In accordance with another aspect of the present invention, there is provided a method of diagnosing a disorder of the breast, for example breast cancer, by determining altered levels of the breast specific polypeptides of the present invention in a biological sample, derived from tissue other than from the breast. Elevated levels of the breast specific polypeptides of the present invention indicate active transcription and expression of the corresponding breast specific gene product. Assays used to detect levels of a breast specific gene polypeptide in a sample derived from a host are well-known to those skilled in the art and include radioimmunoassays, competitive-binding assays, Western blot analysis, ELISA assays and “sandwich” assays. A biological sample may include, but is not limited to, tissue extracts, cell samples or biological fluids; however, in accordance with the present invention, a biological sample specifically does not include tissue or cells of the breast.

An ELISA assay (Coligan, et al., Current Protocols in Immunology, 1(2), Chapter 6, 1991) initially comprises preparing an antibody specific to a breast specific polypeptide of the present invention, preferably a monoclonal antibody. In addition, a reporter antibody is prepared against the monoclonal antibody. To the reporter antibody is attached a detectable reagent such as radioactivity, fluorescence or, in this example, a horseradish peroxidase enzyme. A sample is removed from a host and incubated on a solid support, e.g., a polystyrene dish, that binds the proteins in the sample. Any free protein binding sites on the dish are then covered by incubating with a non-specific protein, such as BSA. Next, the monoclonal antibody is incubated in the dish during which time the monoclonal antibodies attach to the breast specific polypeptide attached to the polystyrene dish. All unbound monoclonal antibody is washed out with buffer. The reporter antibody linked to horseradish peroxidase is now placed in the dish resulting in binding of the reporter antibody to any monoclonal antibody bound to the breast specific gene polypeptide. Unattached reporter antibody is then washed out. Peroxidase substrates are then added to the dish and the amount of color developed in a given time period is a measurement of the amount of the breast specific polypeptide present in a given volume of patient sample when compared against a standard curve.
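The comparison against a standard curve can be sketched in Python as follows. The specification does not say how the curve is fitted; this example assumes piecewise-linear interpolation between standards whose signal rises with concentration, and the standard values, units and function name are hypothetical.

```python
def concentration_from_signal(signal, standards):
    """Estimate concentration from a color signal by linear interpolation.

    standards: list of (concentration, signal) pairs measured for known
    amounts of polypeptide. The signal is assumed to rise with concentration.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (conc_lo, sig_lo), (conc_hi, sig_hi) in zip(points, points[1:]):
        if sig_lo <= signal <= sig_hi:
            if sig_hi == sig_lo:
                return conc_lo
            fraction = (signal - sig_lo) / (sig_hi - sig_lo)
            return conc_lo + fraction * (conc_hi - conc_lo)
    # Refuse to extrapolate beyond the measured standards.
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical standards: (concentration in ng/ml, absorbance reading).
standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]

estimate = concentration_from_signal(0.35, standards)
```

A reading that falls outside the range of the standards raises an error rather than being extrapolated; in practice such a sample would typically be diluted and measured again.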

A competition assay may be employed in which antibodies specific to a breast specific polypeptide are attached to a solid support. The breast specific polypeptide is then labeled, and the labeled polypeptide and a sample derived from the host are passed over the solid support. The amount of label detected, for example by liquid scintillation counting, can be correlated to a quantity of the breast specific polypeptide in the sample.

A “sandwich” assay is similar to an ELISA assay. In a “sandwich” assay, breast specific polypeptides are passed over a solid support and bind to antibody attached to the solid support. A second antibody is then bound to the breast specific polypeptide. A third antibody, which is labeled and is specific to the second antibody, is then passed over the solid support and binds to the second antibody, and the amount of bound label can then be quantified.

In alternative methods, labeled antibodies to a breast specific polypeptide are used. In a one-step assay, the target molecule, if it is present, is immobilized and incubated with a labeled antibody. The labeled antibody binds to the immobilized target molecule. After washing to remove the unbound molecules, the sample is assayed for the presence of the label. In a two-step assay, immobilized target molecule is incubated with an unlabeled antibody. The target molecule-unlabeled antibody complex, if present, is then bound to a second, labeled antibody that is specific for the unlabeled antibody. The sample is washed and assayed for the presence of the label.

Such antibodies specific to breast specific gene proteins, for example, anti-idiotypic antibodies, can be used to detect breast cancer cells by being labeled as described above and binding tightly to the breast cancer cells, thereby detecting their presence.

The antibodies may also be used to target breast cancer cells, for example, in a method of homing interaction agents which, when contacting breast cancer cells, destroy them. This is true since the antibodies are specific for breast specific genes which are primarily expressed in breast cancer, and a linking of the interaction agent to the antibody would cause the interaction agent to be carried directly to the breast.

Antibodies of this type may also be used to do in vivo imaging, for example, by labeling the antibodies to facilitate scanning of the breast. One method for imaging comprises contacting any cancer cells of the breast to be imaged with an anti-breast specific gene protein antibody labeled with a detectable marker. The method is performed under conditions such that the labeled antibody binds to the breast specific gene proteins. In a specific example, the antibodies interact with the breast, for example, breast cancer cells, and fluoresce upon such contact such that imaging and visibility of the breast is enhanced to allow a determination of the diseased or non-diseased state of the breast.

The choice of marker used to label the antibodies will vary depending upon the application. However, the choice of marker is readily determinable to one skilled in the art. These labeled antibodies may be used in immunoassays as well as in histological applications to detect the presence of the proteins. The labeled antibodies may be polyclonal or monoclonal.

The presence of active transcription, which is greater than that normally found, of the breast specific genes in cells other than from the breast, by the presence of an altered level of mRNA, cDNA or expression products is an important indication of the presence of a breast cancer which has metastasized, since breast cancer cells are migrating from the breast into the general circulation. Accordingly, this phenomenon may have important clinical implications since the method of treating a localized, as opposed to a metastasized, tumor is entirely different.

Of the 20 breast specific genes disclosed, only breast specific gene 1 is a full-length gene. Breast specific gene 1 is 79% identical and 83% similar to human Alzheimer disease amyloid gene. Breast specific gene 2 is 30% identical and 48% similar to human hydroxyindole-O-methyltransferase gene. Breast specific gene 3 is 58% identical and 62% similar to human O6-methylguanine-DNA methyltransferase gene. Breast specific gene 4 is 34% identical and 65% similar to the mouse p120 gene. Breast specific gene 5 is 78% identical and 89% similar to human p70 ribosomal S6 kinase alpha-II gene. Breast specific gene 6 is 77% identical and 79% similar to the human transcription factor NFATp gene.

As stated previously, the breast specific genes of the present invention are putative molecular markers in the diagnosis of breast cancer formation and breast cancer metastases. As shown in the following Table 1, when the presence of the breast specific genes was tested in normal breast, breast cancer, embryo and other cancer libraries, the breast specific genes of the present invention were found to be most prevalent in the breast cancer library, indicating that the genes of the present invention may be employed for detecting breast cancer, as discussed previously. The table also indicates a putative identification, based on homology, of BSG1 through BSG6 to known genes.

TABLE 1

Genes | Homolog Gene Name (Class) | Norm Br | Br Ca | Embryo | Other Cancers | Others

BSG1   AD Amyloid (3)   1 6 1
BSG2   Hydroxyindole-o-methyltransferase (2)   3 1 1
BSG3   O-6-methylguanine-DNA methyltransferase (1)   3 1 1
BSG4   P120 (3)   3 1
BSG5   p70 ribosomal S6 kinase alpha-II (2)   3 1
BSG6   Transcription factor NFATp (3)   2
BSG7   2 1
BSG8   4 3 1
BSG8   2
BSG9   3
BSG10   3
BSG11   3
BSG12   3 3
BSG13   3
BSG14   2
BSG15   3
BSG16   1 1 1
BSG17   2 1
BSG18   2
BSG19   1 1
BSG20   2

The assays described above may also be used to test whether bone marrow preserved before chemotherapy is contaminated with micrometastases of a breast cancer cell. In the assay, blood cells from the bone marrow are isolated and treated as described above. This method allows one to determine whether preserved bone marrow is still suitable for transplantation after chemotherapy.

The present invention further relates to mature polypeptides, for example the BSG1 polypeptide, as well as fragments, analogs and derivatives of such polypeptide.

The terms “fragment,” “derivative” and “analog” when referring to the polypeptides encoded by the genes of the invention mean a polypeptide which retains essentially the same biological function or activity as such polypeptide. Thus, an analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature polypeptide.

The polypeptides of the present invention may be recombinant polypeptides, natural polypeptides or synthetic polypeptides, preferably recombinant polypeptides.

The fragment, derivative or analog of the polypeptides encoded by the genes of the invention may be (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one or more of the amino acid residues includes a substituent group, or (iii) one in which the polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or (iv) one in which additional amino acids are fused to the polypeptide, such as a leader or secretory sequence or a sequence which is employed for purification of the mature polypeptide or a proprotein sequence. Such fragments, derivatives and analogs are deemed to be within the scope of those skilled in the art from the teachings herein.

The polypeptides and polynucleotides of the present invention are preferably provided in an isolated form, and preferably are purified to homogeneity.

The term “isolated” means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides may be part of a vector.

The polypeptides of the present invention include the polypeptides encoded by the polynucleotide of FIG. 1 (SEQ ID NO: 1) (in particular the mature polypeptides) as well as polypeptides which have at least 70% similarity (preferably at least a 70% identity) to the polypeptides encoded by the polynucleotide of FIG. 1 (SEQ ID NO: 1) and preferably at least a 90% similarity (preferably at least a 90% identity) to the polypeptides of FIGS. 8 and 9 (SEQ ID NO: 12 and 13) and more preferably at least a 95% similarity (still more preferably at least 95% identity) to the polypeptides encoded by the polynucleotide of SEQ ID NO: 1, and also include portions of such polypeptides with such portion of the polypeptide generally containing at least 30 amino acids and more preferably at least 50 amino acids.

As known in the art, “similarity” between two polypeptides is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one polypeptide to the sequence of a second polypeptide.
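By way of illustration only, the distinction between identity and similarity may be sketched for two sequences that have already been aligned. The grouping of conserved substitutes, the treatment of gaps and the use of the full alignment length as the denominator are assumptions of this sketch and are not taken from the present disclosure.

```python
# Illustrative groups of amino acids treated as conserved substitutes.
# This grouping is an assumption of the sketch, not taken from the text.
CONSERVATIVE_GROUPS = ["AG", "ST", "DE", "NQ", "KRH", "ILVM", "FYW"]


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for two aligned,
    equal-length sequences; '-' marks a gap."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    group_of = {aa: g for g in CONSERVATIVE_GROUPS for aa in g}
    identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # gaps count as neither identical nor similar
        if a == b:
            identical += 1
            similar += 1  # identical residues are also similar
        elif a in group_of and group_of[a] == group_of.get(b):
            similar += 1  # conserved substitution
    n = len(seq_a)
    return 100.0 * identical / n, 100.0 * similar / n


# Hypothetical aligned peptides: M/M and V/V match; K/R, L/I, D/E are conserved.
identity, similarity = identity_and_similarity("MKVLD", "MRVIE")
```

Because every identical position is also counted as similar, the similarity figure is never lower than the identity figure, which is consistent with the paired percentages recited herein.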

Fragments or portions of the polypeptides of the present invention may be employed for producing the corresponding full-length polypeptide by peptide synthesis; therefore, the fragments may be employed as intermediates for producing the full-length polypeptides. Fragments or portions of the polynucleotides of the present invention may be used to synthesize full-length polynucleotides of the present invention.

The present invention also relates to vectors which include polynucleotides of the present invention, host cells which are genetically engineered with vectors of the invention and the production of polypeptides of the invention by recombinant techniques.

Host cells are genetically engineered (transduced or transformed or transfected) with the vectors of this invention which may be, for example, a cloning vector or an expression vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, etc. The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the breast specific genes. The culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those of ordinary skill in the art.

The polynucleotides of the present invention may be employed for producing polypeptides by recombinant techniques. Thus, for example, the polynucleotide may be included in any one of a variety of expression vectors for expressing a polypeptide. Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as long as it is replicable and viable in the host.

The appropriate DNA sequence may be inserted into the vector by a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled in the art.
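By way of illustration only, the choice of an appropriate restriction endonuclease site may be sketched as a search for enzymes that cut the vector exactly once and do not cut the sequence to be inserted. The three enzymes listed, the function names and the example sequences are assumptions of this sketch; the vector is treated as a linear string, so a site spanning the junction of a circular plasmid would be missed.

```python
# Recognition sequences of three common enzymes, chosen for illustration.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    seq = seq.upper()
    return sum(seq.startswith(site, i) for i in range(len(seq)))


def usable_enzymes(vector, insert, sites=SITES):
    """Names of enzymes that cut the vector once and never cut the insert."""
    return sorted(
        name
        for name, site in sites.items()
        if count_sites(vector, site) == 1 and count_sites(insert, site) == 0
    )


# Hypothetical sequences: one EcoRI site and two BamHI sites in the vector.
vector = "TTTGAATTCAAAGGATCCTTTGGATCC"
insert = "ATGGCTTAA"
candidates = usable_enzymes(vector, insert)
```

Requiring a single cut in the vector avoids fragmenting it, and requiring no cut in the insert keeps the inserted sequence intact.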

The DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. As representative examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli lac or trp, the phage lambda P_(L) promoter and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. The vector may also include appropriate sequences for amplifying expression.

In addition, the expression vectors preferably contain one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence as hereinabove described, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.

As representative examples of appropriate hosts, there may be mentioned: bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; fungal cells, such as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed to be within the scope of those skilled in the art from the teachings herein.

More particularly, the present invention also includes recombinant constructs comprising one or more of the sequences as broadly described above. The constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment, the construct further comprises regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available. The following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBS, pD10, phagescript, psiX174, pbluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). However, any other plasmid or vector may be used as long as it is replicable and viable in the host.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda P_(R), P_(L) and trp. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.

In a further embodiment, the present invention relates to host cells containing the above-described constructs. The host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, (1986)).

The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. Alternatively, the polypeptides of the invention can be synthetically produced by conventional peptide synthesizers.

Proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is hereby incorporated by reference.

Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Examples include the SV40 enhancer on the late side of the replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading frame with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice.

As a representative but nonlimiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, Wis., USA). These pBR322 “backbone” sections are combined with an appropriate promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period.

Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.

Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents; such methods are well known to those skilled in the art.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5′ flanking nontranscribed sequences. DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required nontranscribed genetic elements.

The breast specific gene polypeptides can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.

The polynucleotides of the present invention may have the coding sequence fused in frame to a marker sequence which allows for purification of the polypeptide of the present invention. An example of a marker sequence is a hexahistidine tag which may be supplied by a vector, preferably a pQE-9 vector, which provides for purification of the polypeptide fused to the marker in the case of a bacterial host, or, for example, the marker sequence may be a hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 cells, is used. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson, I., et al., Cell, 37:767 (1984)).
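By way of illustration only, fusing a coding sequence in frame to a hexahistidine marker sequence may be sketched as follows. Placement of the tag at the C-terminus, the choice of the CAT codon for histidine and the function name are assumptions made for simplicity; in practice the vector determines where the tag is supplied.

```python
HIS6_TAG = "CAT" * 6  # six histidine codons (CAT encodes His)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_c_terminal_tag(coding_seq, tag=HIS6_TAG, stop="TAA"):
    """Return coding_seq + tag + stop, keeping the tag in frame."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Drop a terminal stop codon so translation continues into the tag.
    if codons and codons[-1] in STOP_CODONS:
        codons.pop()
    if any(c in STOP_CODONS for c in codons):
        raise ValueError("internal stop codon would truncate the fusion")
    return "".join(codons) + tag + stop


# Hypothetical three-codon open reading frame: Met-Ala-stop.
fusion = fuse_c_terminal_tag("ATGGCTTAA")
```

The length check is what keeps the marker in frame: a coding sequence that is not a whole number of codons would shift the tag into a different reading frame.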

The polypeptides of the present invention may be a naturally purified product, or a product of chemical synthetic procedures, or produced by recombinant techniques from a prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and mammalian cells in culture). Depending upon the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. Polypeptides of the invention may also include an initial methionine amino acid residue.

BSG1, and other breast specific genes, and the protein product thereof may be employed for early detection of breast cancer since they are over-expressed in the breast cancer state.

In accordance with another aspect of the present invention there are provided assays which may be used to screen for therapeutics to inhibit the action of the breast specific genes or breast specific proteins of the present invention. The present invention discloses methods for selecting a therapeutic which forms a complex with breast specific gene proteins with sufficient affinity to prevent their biological action. The methods include various assays, including competitive assays where the proteins are immobilized to a support, and are contacted with a natural substrate and a labeled therapeutic either simultaneously or in either consecutive order, and determining whether the therapeutic effectively competes with the natural substrate in a manner sufficient to prevent binding of the protein to its substrate.

In another embodiment, the substrate is immobilized to a support, and is contacted with both a labeled breast specific polypeptide and a therapeutic (or unlabeled proteins and a labeled therapeutic), and it is determined whether the amount of the breast specific polypeptide bound to the substrate is reduced in comparison to the assay without the therapeutic added. The breast specific polypeptide may be labeled with antibodies.

Potential therapeutic compounds include antibodies and anti-idiotypic antibodies as described above, or in some cases, an oligonucleotide, which binds to the polypeptide.

Another example is an antisense construct prepared using antisense technology, which is directed to a breast specific polynucleotide to prevent transcription. Antisense technology can be used to control gene expression through triple-helix formation or antisense DNA or RNA, both of which methods are based on binding of a polynucleotide to DNA or RNA. For example, the 5′ coding portion of the polynucleotide sequence, which encodes the mature polypeptides of the present invention, is used to design an antisense RNA oligonucleotide of from about 10 to 40 base pairs in length. A DNA oligonucleotide is designed to be complementary to a region of the gene involved in transcription (triple helix; see Lee et al., Nucl. Acids Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)), thereby preventing transcription and the production of a breast specific polynucleotide. The antisense RNA oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the mRNA molecule into the breast specific gene polypeptide (antisense; see Okano, J. Neurochem., 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988)). The oligonucleotides described above can also be delivered to cells such that the antisense RNA or DNA may be expressed in vivo to inhibit production of the breast specific polypeptides.
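By way of illustration only, the design of an antisense RNA oligonucleotide from the 5′ coding portion of a polynucleotide may be sketched as taking the reverse complement of a window of the sense strand and writing it as RNA. The function name, the default length and the example sequence are hypothetical; only the range of about 10 to 40 bases is taken from the text above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_rna(coding_dna, start=0, length=20):
    """Return an antisense RNA oligonucleotide (5'->3') directed against a
    window of the sense-strand coding DNA."""
    if not 10 <= length <= 40:
        raise ValueError("length outside the 10 to 40 range given in the text")
    window = coding_dna.upper()[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the sequence")
    # Reverse complement of the sense DNA, written as RNA (T -> U).
    return window.translate(COMPLEMENT)[::-1].replace("T", "U")


# Hypothetical 5' coding portion; the oligo targets its first ten bases.
oligo = antisense_rna("ATGGCCAAGTTT", start=0, length=10)
```

Because the mRNA has the same sequence as the sense strand (with U in place of T), the reverse complement pairs with the mRNA in antiparallel orientation, which is the hybridization relied upon to block translation.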

Another example is a small molecule which binds to and occupies the active site of the breast specific polypeptide thereby making the active site inaccessible to substrate such that normal biological activity is prevented. Examples of small molecules include but are not limited to small peptides or peptide-like molecules.

These compounds may be employed to treat breast cancer, since they interact with the function of breast specific polypeptides in a manner sufficient to inhibit natural function which is necessary for the viability of breast cancer cells. This is true since the BSGs and their protein products are primarily expressed in breast cancer tissues and are, therefore, suspected of being critical to the formation of this state.

The compounds may be employed in a composition with a pharmaceutically acceptable carrier, e.g., as hereinafter described.

The compounds of the present invention may be employed in combination with a suitable pharmaceutical carrier. Such compositions comprise a therapeutically effective amount of the polypeptide, and a pharmaceutically acceptable carrier or excipient. Such a carrier includes but is not limited to saline, buffered saline, dextrose, water, glycerol, ethanol, and combinations thereof. The formulation should suit the mode of administration.

The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. In addition, the pharmaceutical compositions may be employed in conjunction with other therapeutic compounds.

The pharmaceutical compositions may be administered in a convenient manner such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal, intra-anal or intradermal routes. The pharmaceutical compositions are administered in an amount which is effective for treating and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 10 μg/kg body weight and in most cases they will be administered in an amount not in excess of about 8 mg/kg body weight per day. In most cases, the dosage is from about 10 μg/kg to about 1 mg/kg body weight daily, taking into account the routes of administration, symptoms, etc.
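By way of illustration only, the arithmetic of converting a per-kilogram range into a daily amount for a given body weight may be sketched as follows. The sketch reads the lower bound as about 10 μg/kg and the usual upper bound as about 1 mg/kg; the function name and the example body weight are hypothetical, and the result is a unit conversion rather than dosing guidance.

```python
def daily_amount_range_mg(body_weight_kg, low_ug_per_kg=10.0, high_mg_per_kg=1.0):
    """Return (low, high) daily amounts in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = body_weight_kg * low_ug_per_kg / 1000.0  # convert ug to mg
    high_mg = body_weight_kg * high_mg_per_kg
    return low_mg, high_mg


# Hypothetical 70 kg body weight.
low_mg, high_mg = daily_amount_range_mg(70.0)
```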

The breast specific genes and compounds which are polypeptides may also be employed in accordance with the present invention by expression of such polypeptides in vivo, which is often referred to as “gene therapy.”

Thus, for example, cells from a patient may be engineered with a polynucleotide (DNA or RNA) encoding a polypeptide ex vivo, with the engineered cells then being provided to a patient to be treated with the polypeptide. Such methods are well-known in the art. For example, cells may be engineered by procedures known in the art by use of a retroviral particle containing RNA encoding a polypeptide of the present invention.

Similarly, cells may be engineered in vivo for expression of a polypeptide in vivo by, for example, procedures known in the art. As known in the art, a producer cell for producing a retroviral particle containing RNA encoding a polypeptide of the present invention may be administered to a patient for engineering cells in vivo and expression of the polypeptide in vivo. These and other methods for administering a polypeptide of the present invention by such method should be apparent to those skilled in the art from the teachings of the present invention. For example, the expression vehicle for engineering cells may be other than a retrovirus, for example, an adenovirus which may be used to engineer cells in vivo after combination with a suitable delivery vehicle.

Retroviruses from which the retroviral plasmid vectors hereinabove mentioned may be derived include, but are not limited to, Moloney Murine Leukemia Virus, spleen necrosis virus, retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, gibbon ape leukemia virus, human immunodeficiency virus, adenovirus, Myeloproliferative Sarcoma Virus, and mammary tumor virus. In one embodiment, the retroviral plasmid vector is derived from Moloney Murine Leukemia Virus.

The vector includes one or more promoters. Suitable promoters which may be employed include, but are not limited to, the retroviral LTR; the SV40 promoter; and the human cytomegalovirus (CMV) promoter described in Miller, et al., Biotechniques, Vol. 7, No. 9, 980-990 (1989), or any other promoter (e.g., cellular promoters such as eukaryotic cellular promoters including, but not limited to, the histone, pol III, and β-actin promoters). Other viral promoters which may be employed include, but are not limited to, adenovirus promoters, thymidine kinase (TK) promoters, and B19 parvovirus promoters. The selection of a suitable promoter will be apparent to those skilled in the art from the teachings contained herein.

The nucleic acid sequence encoding the polypeptide of the present invention is under the control of a suitable promoter. Suitable promoters which may be employed include, but are not limited to, adenoviral promoters, such as the adenoviral major late promoter; or heterologous promoters, such as the cytomegalovirus (CMV) promoter; the respiratory syncytial virus (RSV) promoter; inducible promoters, such as the MMT promoter, the metallothionein promoter; heat shock promoters; the albumin promoter; the ApoAI promoter; human globin promoters; viral thymidine kinase promoters, such as the Herpes Simplex thymidine kinase promoter; retroviral LTRs (including the modified retroviral LTRs hereinabove described); the β-actin promoter; and human growth hormone promoters. The promoter also may be the native promoter which controls the genes encoding the polypeptides.

The retroviral plasmid vector is employed to transduce packaging cell lines to form producer cell lines. Examples of packaging cells which may be transfected include, but are not limited to, the PE501, PA317, ψ-2, ψ-AM, PA12, T19-14X, VT-19-17-H2, CRE, CRIP, GP+E-86, GP+envAm12, and DAN cell lines as described in Miller, Human Gene Therapy, Vol. 1, pgs. 5-14 (1990), which is incorporated herein by reference in its entirety. The vector may transduce the packaging cells through any means known in the art. Such means include, but are not limited to, electroporation, the use of liposomes, and CaPO₄ precipitation. In one alternative, the retroviral plasmid vector may be encapsulated into a liposome, or coupled to a lipid, and then administered to a host.

The producer cell line generates infectious retroviral vector particles which include the nucleic acid sequence(s) encoding the polypeptides. Such retroviral vector particles then may be employed to transduce eukaryotic cells, either in vitro or in vivo. The transduced eukaryotic cells will express the nucleic acid sequence(s) encoding the polypeptide. Eukaryotic cells which may be transduced include, but are not limited to, embryonic stem cells, embryonic carcinoma cells, as well as hematopoietic stem cells, hepatocytes, fibroblasts, myoblasts, keratinocytes, endothelial cells, and bronchial epithelial cells.

This invention is also related to the use of the breast specific genes of the present invention as a diagnostic. For example, some diseases result from inherited defective genes. The breast specific genes BSG7 and BSG10, for example, have been found to have a reduced expression in breast cancer cells as compared to that in normal cells. Further, the remaining breast specific genes of the present invention are overexpressed in breast cancer. Accordingly, a mutation in these genes allows a detection of breast disorders, for example, breast cancer. A mutation in a breast specific gene of the present invention at the DNA level may be detected by a variety of techniques. Nucleic acids used for diagnosis (genomic DNA, mRNA, etc.) may be obtained from a patient's cells, other than from the breast, such as from blood, urine, saliva, tissue biopsy and autopsy material. The genomic DNA may be used directly for detection or may be amplified enzymatically by using PCR (Saiki, et al., Nature 324:163-166 (1986)) prior to analysis. RNA or cDNA may also be used for the same purpose. As an example, PCR primers complementary to the nucleic acid of the instant invention can be used to identify and analyze mutations in a breast specific polynucleotide of the present invention. For example, deletions and insertions can be detected by a change in size of the amplified product in comparison to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to radiolabelled breast specific RNA or, alternatively, radiolabelled antisense DNA sequences.
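By way of illustration only, detection of a deletion or insertion by a change in size of the amplified product may be sketched as an in silico comparison of amplicon lengths. The sketch assumes exact primer matches on a single sense-strand template; the function names, primers and sequences are hypothetical and are not taken from the present disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product amplified from `template` (sense strand),
    or None if either primer fails to find an exact match."""
    start = template.find(forward)
    if start < 0:
        return None
    # The reverse primer anneals where its reverse complement appears.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return end + len(site) - start


def size_change(normal, sample, forward, reverse):
    """Sample amplicon length minus normal amplicon length, in bp."""
    n = amplicon_length(normal, forward, reverse)
    s = amplicon_length(sample, forward, reverse)
    if n is None or s is None:
        raise ValueError("primers do not amplify both templates")
    return s - n


# Hypothetical primers and templates; the sample lacks three bases (TTT).
forward = "AAGCTTGGA"
reverse = reverse_complement("TCCGGAATT")
normal = forward + "CCCTTTAAAGGG" + "TCCGGAATT"
sample = forward + "CCCAAAGGG" + "TCCGGAATT"
delta = size_change(normal, sample, forward, reverse)
```

A negative result corresponds to a deletion and a positive result to an insertion relative to the normal genotype; a point mutation leaves the size unchanged and would require one of the hybridization-based methods described above.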

Another well-established method for screening for mutations in particular segments of DNA after PCR amplification is single-strand conformation polymorphism (SSCP) analysis. PCR products are prepared for SSCP by ten cycles of reamplification to incorporate ³²P-dCTP, digested with an appropriate restriction enzyme to generate 200-300 bp fragments, and denatured by heating to 85°C for 5 min. and then plunged into ice. Electrophoresis is then carried out in a nondenaturing gel (5% glycerol, 5% acrylamide) (Glavac, D. and Dean, M., Human Mutation, 2:404-414 (1993)).

Sequence differences between the reference gene and “mutants” may be revealed by the direct DNA sequencing method. In addition, cloned DNA segments may be used as probes to detect specific DNA segments. The sensitivity of this method is greatly enhanced when combined with PCR. For example, a sequencing primer is used with double-stranded PCR product or a single-stranded template molecule generated by a modified PCR. The sequence determination is performed by conventional procedures with radiolabeled nucleotides or by automatic sequencing procedures with fluorescent tags.

Genetic testing based on DNA sequence differences may be achieved by detection of alteration in electrophoretic mobility of DNA fragments in gels with or without denaturing agents. Small sequence deletions and insertions can be visualized by high-resolution gel electrophoresis. DNA fragments of different sequences may be distinguished on denaturing formamide gradient gels in which the mobilities of different DNA fragments are retarded in the gel at different positions according to their specific melting or partial melting temperatures (see, e.g., Myers, et al., Science, 230:1242 (1985)). In addition, sequence alterations, in particular small deletions, may be detected as changes in the migration pattern of DNA.

Sequence changes at specific locations may also be revealed by nuclease protection assays, such as RNase and S1 protection or the chemical cleavage method (e.g., Cotton, et al., PNAS, USA, 85:4397-4401 (1988)).

Thus, the detection of the specific DNA sequence may be achieved by methods such as hybridization, RNase protection, chemical cleavage, direct DNA sequencing, or the use of restriction enzymes (e.g., Restriction Fragment Length Polymorphisms (RFLP)) and Southern blotting.

The sequences of the present invention are also valuable for chromosome identification. The sequence is specifically targeted to and can hybridize with a particular location on an individual human chromosome. Moreover, there is a current need for identifying particular sites on the chromosome. Few chromosome marking reagents based on actual sequence data (repeat polymorphisms) are presently available for marking chromosomal location. The mapping of DNAs to chromosomes according to the present invention is an important first step in correlating those sequences with genes associated with disease.

Briefly, sequences can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp) from the cDNA. Computer analysis of the 3′ untranslated region is used to rapidly select primers that do not span more than one exon in the genomic DNA, since primers spanning more than one exon would complicate the amplification process. These primers are then used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the primer will yield an amplified fragment.

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular DNA to a particular chromosome. Using the present invention with the same oligonucleotide primers, sublocalization can be achieved with panels of fragments from specific chromosomes or pools of large genomic clones in an analogous manner. Other mapping strategies that can similarly be used to map a sequence to its chromosome include in situ hybridization, prescreening with labeled flow-sorted chromosomes and preselection by hybridization to construct chromosome-specific cDNA libraries.

Fluorescence in situ hybridization (FISH) of a cDNA clone to a metaphase chromosomal spread can be used to provide a precise chromosomal location in one step. This technique can be used with cDNA as short as 50 or 60 bases. For a review of this technique, see Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988).

Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man (available on line through Johns Hopkins University Welch Medical Library). The relationship between genes and diseases that have been mapped to the same chromosomal region is then identified through linkage analysis (coinheritance of physically adjacent genes).

Next, it is necessary to determine the differences in the cDNA or genomic sequence between affected and unaffected individuals. If a mutation is observed in some or all of the affected individuals but not in any normal individuals, then the mutation is likely to be the causative agent of the disease.

With current resolution of physical mapping and genetic mapping techniques, a cDNA precisely localized to a chromosomal region associated with the disease could be one of between 50 and 500 potential causative genes. (This assumes 1 megabase mapping resolution and one gene per 20 kb.)
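The arithmetic behind this estimate can be written out as a short Python sketch. The function name is hypothetical. The 1 megabase and 20 kb figures are those stated above; the 10 megabase interval is an assumption introduced here only to show what region size the upper figure of 500 would correspond to at the same gene density.

```python
def candidate_gene_count(region_bp, bp_per_gene=20_000):
    """Number of genes expected in a mapped interval at one gene per bp_per_gene."""
    return region_bp // bp_per_gene

# 1 megabase mapping resolution, one gene per 20 kb (figures from the text)
print(candidate_gene_count(1_000_000))    # 50

# Assumed coarser 10 megabase interval at the same gene density
print(candidate_gene_count(10_000_000))   # 500
```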

The polypeptides, their fragments or other derivatives, or analogs thereof, or cells expressing them can be used as an immunogen to produce antibodies thereto. These antibodies can be, for example, polyclonal or monoclonal antibodies. The present invention also includes chimeric, single chain, and humanized antibodies, as well as Fab fragments, or the product of an Fab expression library. Various procedures known in the art may be used for the production of such antibodies and fragments.

Antibodies generated against the polypeptides corresponding to a sequence of the present invention can be obtained by direct injection of the polypeptides into an animal or by administering the polypeptides to an animal, preferably a nonhuman. The antibody so obtained will then bind the polypeptides themselves. In this manner, even a sequence encoding only a fragment of the polypeptides can be used to generate antibodies binding the whole native polypeptides. Such antibodies can then be used to isolate the polypeptide from tissue expressing that polypeptide.

For preparation of monoclonal antibodies, any technique which provides antibodies produced by continuous cell line cultures can be used. Examples include the hybridoma technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

Techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce single chain antibodies to immunogenic polypeptide products of this invention. Transgenic mice may also be used to generate antibodies.

The antibodies may also be employed to target breast cancer cells, for example, in a method of homing interaction agents which, when contacting breast cancer cells, destroy them. This is possible because the antibodies are specific for the breast specific polypeptides of the present invention. A linking of the interaction agent to the antibody would cause the interaction agent to be carried directly to the breast.

Antibodies of this type may also be used to do in vivo imaging, for example, by labeling the antibodies to facilitate scanning of the pelvic area and the breast. One method for imaging comprises contacting any cancer cells of the breast to be imaged with an anti-breast specific protein-antibody labeled with a detectable marker. The method is performed under conditions such that the labeled antibody binds to the breast specific polypeptides. In a specific example, the antibodies interact with the breast, for example, breast cancer cells, and fluoresce upon contact such that imaging and visibility of the breast are enhanced to allow a determination of the diseased or non-diseased state of the breast.

The present invention will be further described with reference to the following examples; however, it is to be understood that the present invention is not limited to such examples. All parts or amounts, unless otherwise specified, are by weight.

In order to facilitate understanding of the following examples, certain frequently occurring methods and/or terms will be described.

“Plasmids” are designated by a lower case p preceded and/or followed by capital letters and/or numbers. The starting plasmids herein are either commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accord with published procedures. In addition, equivalent plasmids to those described are known in the art and will be apparent to the ordinarily skilled artisan.

“Digestion” of DNA refers to catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA. The various restriction enzymes used herein are commercially available and their reaction conditions, cofactors and other requirements were used as would be known to the ordinarily skilled artisan. For analytical purposes, typically 1 μg of plasmid or DNA fragment is used with about 2 units of enzyme in about 20 μl of buffer solution. For the purpose of isolating DNA fragments for plasmid construction, typically 5 to 50 μg of DNA are digested with 20 to 250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are ordinarily used, but may vary in accordance with the supplier's instructions. After digestion the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired fragment.

Size separation of the cleaved fragments is performed using a 1 percent TAE agarose gel as described by Sambrook, et al., “Molecular Cloning: A Laboratory Manual”, Cold Spring Harbor Laboratory Press, (1989).

“Oligonucleotides” refers to either a single stranded polydeoxynucleotide or two complementary polydeoxynucleotide strands which may be chemically synthesized. Such synthetic oligonucleotides have no 5′ phosphate and thus will not ligate to another oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated.

“Ligation” refers to the process of forming phosphodiester bonds between two double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise provided, ligation may be accomplished using known buffers and conditions with 10 units of T4 DNA ligase (“ligase”) per 0.5 μg of approximately equimolar amounts of the DNA fragments to be ligated.

Unless otherwise stated, transformation was performed as described in the method of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973).

EXAMPLE 1 Determination of Transcription of a Breast Specific Gene

To assess the presence or absence of active transcription of a breast specific gene RNA, approximately 6 ml of venous blood is obtained with a standard venipuncture technique using heparinized tubes. Whole blood is mixed with an equal volume of phosphate buffered saline, which is then layered over 8 ml of Ficoll (Pharmacia, Uppsala, Sweden) in a 15-ml polystyrene tube. The gradient is centrifuged at 1800×g for 20 min. at 5°C. The lymphocyte and granulocyte layer (approximately 5 ml) is carefully aspirated and rediluted up to 50 ml with phosphate-buffered saline in a 50-ml tube, which is centrifuged again at 1800×g for 20 min. at 5°C. The supernatant is discarded and the pellet containing nucleated cells is used for RNA extraction using the RNazole B method as described by the manufacturer (Tel-Test Inc., Friendswood, Tex.).

To determine the quantity of mRNA, a probe is designed with an identity to at least a portion of the mRNA sequence transcribed from a human gene whose coding portion includes a DNA sequence of FIGS. 1-20 (SEQ ID NO: 1-2, 4, 6, 8-9, 10-24). This probe is mixed with the extracted RNA and the mixed DNA and RNA are precipitated with ethanol (−70°C for 15 minutes). The pellet is resuspended in hybridization buffer and dissolved. The tubes containing the mixture are incubated in a 72°C water bath for 10-15 min. to denature the DNA. The tubes are rapidly transferred to a water bath at the desired hybridization temperature. Hybridization temperature depends on the G+C content of the DNA. Hybridization is done for 3 hrs. Then 0.3 ml of nuclease-S1 buffer is added and mixed well. 50 μl of 4.0 M ammonium acetate and 0.1 M EDTA is added to stop the reaction. The mixture is extracted with phenol/chloroform, 20 μg of carrier tRNA is added, and precipitation is done with an equal volume of isopropanol. The precipitate is dissolved in 40 μl of TE (pH 7.4) and run on an alkaline agarose gel. Following electrophoresis, the RNA is microsequenced to confirm the nucleotide sequence. (See Favaloro, J. et al., Methods Enzymol., 65:718 (1980) for a more detailed review.)
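Because the hybridization temperature depends on the G+C content of the DNA, that content is a natural quantity to compute for a candidate probe. The following Python sketch shows one way to do so. The function name is hypothetical, and no formula converting G+C content into a temperature is given here because the text does not specify one. The example sequence is the 5′ primer of Example 2 (SEQ ID NO:25), used only as convenient input.

```python
def gc_fraction(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)

# 5' primer of Example 2 (SEQ ID NO:25): 10 of its 21 bases are G or C
print(round(gc_fraction("GCCACCATGGATGTTTTCAAG"), 3))  # 0.476
```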

Two oligonucleotide primers are employed to amplify the sequence isolated by the above methods. The 5′ primer is 20 nucleotides long and the 3′ primer is a complementary sequence for the 3′ end of the isolated mRNA. The primers are custom designed according to the isolated mRNA. The reverse transcriptase reaction and PCR amplification are performed sequentially without interruption in a Perkin Elmer 9600 PCR machine (Emeryville, Calif.). Four hundred ng total RNA in 20 μl diethylpyrocarbonate-treated water are placed in a 65°C water bath for 5 min. and then quickly chilled on ice immediately prior to the addition of PCR reagents. The 50-μl total PCR volume consists of 2.5 units Taq polymerase (Perkin-Elmer); 2 units avian myeloblastosis virus reverse transcriptase (Boehringer Mannheim, Indianapolis, Ind.); 200 μM each of dCTP, dATP, dGTP and dTTP (Perkin Elmer); 18 pM each primer; 10 mM Tris-HCl; 50 mM KCl; and 2 mM MgCl₂ (Perkin Elmer). PCR conditions are as follows: cycle 1 is 42°C for 15 min. then 97°C for 15 s (1 cycle); cycle 2 is 95°C for 1 min., 60°C for 1 min., and 72°C for 30 s (15 cycles); cycle 3 is 95°C for 1 min., 60°C for 1 min., and 72°C for 1 min. (10 cycles); cycle 4 is 95°C for 1 min., 60°C for 1 min., and 72°C for 2 min. (8 cycles); cycle 5 is 72°C for 15 min. (1 cycle); and the final cycle is a 4°C hold until the sample is taken out of the machine. The 50-μl PCR products are concentrated down to 10 μl with vacuum centrifugation, and a sample is then run on a thin 1.2% Tris-borate-EDTA agarose gel containing ethidium bromide. A band of expected size would indicate that this gene is present in the tissue assayed. The amount of RNA in the pellet may be quantified in numerous ways; for example, it may be weighed.
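The thermal cycling program above can be written down as data and summed, which makes it easy to check the number of amplification cycles and the programmed run time. The Python sketch below is illustrative only. The stage labels "reverse transcription" and "final extension" are interpretive names added here, the function names are hypothetical, and the total excludes ramp times between temperatures and the open-ended 4°C hold.

```python
# Each stage: (label, [(temperature in C, hold in seconds), ...], repeats)
PROGRAM = [
    ("reverse transcription", [(42, 15 * 60), (97, 15)], 1),
    ("cycle 2", [(95, 60), (60, 60), (72, 30)], 15),
    ("cycle 3", [(95, 60), (60, 60), (72, 60)], 10),
    ("cycle 4", [(95, 60), (60, 60), (72, 120)], 8),
    ("final extension", [(72, 15 * 60)], 1),
]

def total_hold_seconds(program):
    """Sum of all programmed holds, ignoring ramp times and the 4 C hold."""
    return sum(sum(sec for _, sec in steps) * repeats
               for _, steps, repeats in program)

def amplification_cycles(program):
    """Count repeats of stages that include a 95 C denaturation step."""
    return sum(repeats for _, steps, repeats in program
               if any(temp == 95 for temp, _ in steps))

print(amplification_cycles(PROGRAM))        # 33
print(total_hold_seconds(PROGRAM) / 60)     # 129.75 (minutes)
```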

Verification of the nucleotide sequence of the PCR products is done by microsequencing. The PCR product is purified with a Qiagen PCR Product Purification Kit (Qiagen, Chatsworth, Calif.) as described by the manufacturer. One μg of the PCR product undergoes PCR sequencing by using the Taq DyeDeoxy Terminator Cycle sequencing kit in a Perkin-Elmer 9600 PCR machine as described by Applied Biosystems (Foster City, Calif.). The sequenced product is purified using Centri-Sep columns (Princeton Separations, Adelphia, N.J.) as described by the company. This product is then analyzed with an ABI model 373A DNA sequencing system (Applied Biosystems) integrated with a Macintosh IIci computer.

EXAMPLE 2 Bacterial Expression and Purification of the BSG Proteins and Use for Preparing a Monoclonal Antibody

The DNA sequence encoding a polypeptide of the present invention, for this example BSG1, ATCC # 97175, is initially amplified using PCR oligonucleotide primers corresponding to the 5′ sequences of the protein and the vector sequences 3′ to the protein. Additional nucleotides corresponding to the DNA sequence are added to the 5′ and 3′ sequences respectively. The 5′ oligonucleotide primer has the sequence 5′ GCCACCATGGATGTTTTCAAG 3′ (SEQ ID NO:25) and contains an NcoI restriction enzyme site followed by 15 nucleotides of coding sequence starting from the initial amino acid of the processed protein. The 3′ primer has the sequence 5′ GCGCAGATCTGTCTCCCCCACTCTGGGC 3′ (SEQ ID NO:26) and contains a complementary sequence to a BglII restriction enzyme site followed by 18 nucleotides of the nucleic acid sequence encoding the protein. The restriction enzyme sites correspond to the restriction enzyme sites on a bacterial expression vector, pQE-60 (Qiagen, Inc., Chatsworth, Calif.). pQE-60 encodes antibiotic resistance (Ampʳ), a bacterial origin of replication (ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His tag and restriction enzyme sites. pQE-60 is then digested with NcoI and BglII. The amplified sequences are ligated into pQE-60 and inserted in frame with the sequence encoding the histidine tag and the RBS. The ligation mixture is then used to transform E. coli strain M15/rep4 (Qiagen) by the procedure described in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, (1989). M15/rep4 contains multiple copies of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin resistance (Kanʳ). Transformants are identified by their ability to grow on LB plates, and ampicillin/kanamycin resistant colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis.
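The layout of the two primers can be checked with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical, and the recognition sequences CCATGG (NcoI) and AGATCT (BglII) are the standard sites for those enzymes rather than values stated in the text. In the 5′ primer the ATG start codon lies inside the NcoI site, so the 15 coding nucleotides are counted from that ATG.

```python
SITES = {"NcoI": "CCATGG", "BglII": "AGATCT"}

FORWARD_PRIMER = "GCCACCATGGATGTTTTCAAG"          # SEQ ID NO:25
REVERSE_PRIMER = "GCGCAGATCTGTCTCCCCCACTCTGGGC"   # SEQ ID NO:26

def site_position(primer, enzyme):
    """Zero-based index of the enzyme's recognition site, or -1 if absent."""
    return primer.find(SITES[enzyme])

def bases_after_site(primer, enzyme):
    """Number of bases that follow the recognition site in the primer."""
    start = site_position(primer, enzyme)
    if start < 0:
        raise ValueError(f"{enzyme} site not found")
    return len(primer) - (start + len(SITES[enzyme]))

# 5' primer: NcoI site at index 4; coding sequence runs from the ATG within it
print(site_position(FORWARD_PRIMER, "NcoI"))               # 4
print(len(FORWARD_PRIMER) - FORWARD_PRIMER.find("ATG"))    # 15

# 3' primer: BglII site at index 4, followed by 18 gene-specific bases
print(site_position(REVERSE_PRIMER, "BglII"))              # 4
print(bases_after_site(REVERSE_PRIMER, "BglII"))           # 18
```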

Clones containing the desired constructs are grown overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 μg/ml) and Kan (25 μg/ml). The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The cells are grown to an optical density at 600 nm (O.D.₆₀₀) of between 0.4 and 0.6. IPTG (“Isopropyl-β-D-thiogalactopyranoside”) is then added to a final concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O and leading to increased gene expression. Cells are grown an extra 3 to 4 hours. Cells are then harvested by centrifugation. The cell pellet is solubilized in the chaotropic agent 6 molar guanidine HCl. After clarification, solubilized protein is purified from this solution by chromatography on a Nickel-Chelate column under conditions that allow for tight binding by proteins containing the 6-His tag (Hochuli, E. et al., J. Chromatography 411:177-184 (1984)). BSG1 protein (>90% pure) is eluted from the column in 6 molar guanidine HCl pH 5.0 and, for the purpose of renaturation, adjusted to 3 molar guanidine HCl, 100 mM sodium phosphate, 10 mmolar glutathione (reduced) and 2 mmolar glutathione (oxidized). After incubation in this solution for 12 hours the protein is dialyzed to 10 mmolar sodium phosphate.
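Two of the quantities above lend themselves to a quick calculation: the inoculum volume for a given dilution ratio, and the volume of diluent needed to bring 6 molar guanidine HCl to 3 molar. The Python sketch below is illustrative only. The function names, the 1 liter culture and the 10 ml eluate are hypothetical, and the ratio is assumed here to mean inoculum volume relative to final culture volume. The dilution uses the usual C1·V1 = C2·V2 relationship and ignores the volume contributed by the other renaturation components.

```python
def inoculum_volume_ml(culture_volume_ml, ratio):
    """Overnight culture needed for a 1:ratio inoculation (assumed v/v of final)."""
    return culture_volume_ml / ratio

def diluent_volume_ml(stock_molar, target_molar, stock_volume_ml):
    """Diluent to add so that stock_molar becomes target_molar (C1*V1 = C2*V2)."""
    if target_molar <= 0 or target_molar > stock_molar:
        raise ValueError("target must be positive and not exceed the stock")
    final_volume_ml = stock_molar * stock_volume_ml / target_molar
    return final_volume_ml - stock_volume_ml

# Hypothetical 1 liter culture inoculated at 1:100 and at 1:250
print(inoculum_volume_ml(1000, 100))     # 10.0
print(inoculum_volume_ml(1000, 250))     # 4.0

# Hypothetical 10 ml eluate in 6 M guanidine HCl adjusted to 3 M
print(diluent_volume_ml(6.0, 3.0, 10.0)) # 10.0
```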

The protein purified in this manner may be used as an epitope to raise monoclonal antibodies specific to such protein. The monoclonal antibodies generated against the isolated protein can be obtained by direct injection of the polypeptides into an animal or by administering the polypeptides to an animal. The antibodies so obtained will then bind to the protein itself. Such antibodies can then be used to isolate the protein from tissue expressing that polypeptide, for example, by use of an ELISA assay.

EXAMPLE 3 Preparation of cDNA Libraries from Breast Tissue

Total cellular RNA is prepared from tissues by the guanidinium-phenol method as previously described (P. Chomczynski and N. Sacchi, Anal. Biochem., 162:156-159 (1987)) using RNAzol (Cinna-Biotecx). An additional ethanol precipitation of the RNA is included. Poly A mRNA is isolated from the total RNA using oligo dT-coated latex beads (Qiagen). Two rounds of poly A selection are performed to ensure better separation from non-polyadenylated material when sufficient quantities of total RNA are available.

The mRNA selected on the oligo dT is used for the synthesis of cDNA by a modification of the method of Gubler and Hoffman (Gubler, U. and B. J. Hoffman, 1983, Gene, 25:263). The first strand synthesis is performed using either Moloney murine sarcoma virus reverse transcriptase (Stratagene) or Superscript II (RNase H minus Moloney murine reverse transcriptase, Gibco-BRL). First strand synthesis is primed using a primer/linker containing an Xho I restriction site. The nucleotide mix used in the synthesis contains methylated dCTP to prevent restriction within the cDNA sequence. For second-strand synthesis E. coli polymerase Klenow fragment is used and [³²P]-dATP is incorporated as a tracer of nucleotide incorporation.

Following second-strand synthesis, the cDNA is made blunt ended using either T4 DNA polymerase or Klenow fragment. Eco RI adapters are added to the cDNA and the cDNA is restricted with Xho I. The cDNA is size fractionated over a Sephacryl S-500 column (Pharmacia) to remove excess linkers and cDNAs under approximately 500 base pairs.

The cDNA is cloned unidirectionally into the Eco RI-Xho I sites of either pBluescript II phagemid or lambda Uni-Zap XR (Stratagene). In the case of cloning into pBluescript II, the plasmids are electroporated into E. coli SURE competent cells (Stratagene). When the cDNA is cloned into Uni-Zap XR it is packaged using the Gigapack II packaging extract (Stratagene). The packaged phage is used to infect SURE cells and amplified. The pBluescript phagemids containing the cDNA inserts are excised from the lambda Zap phage using the helper phage ExAssist (Stratagene). The rescued phagemid is plated on SOLR E. coli cells (Stratagene).

Preparation of Sequencing Templates

Template DNA for sequencing is prepared by 1) a boiling method or 2) PCR amplification.

The boiling method is a modification of the method of Holmes and Quigley (Holmes, D. S. and M. Quigley, 1981, Anal. Biochem., 114:193). Colonies from either cDNA cloned into Bluescript II or rescued Bluescript phagemid are grown in an enriched bacterial media overnight. 400 μl of cells are centrifuged and resuspended in STET (0.1 M NaCl, 10 mM Tris pH 8.0, 1.0 mM EDTA and 5% Triton X-100) including lysozyme (80 μg/ml) and RNase A (4 μg/ml). Cells are boiled for 40 seconds and centrifuged for 10 minutes. The supernatant is removed and the DNA is precipitated with PEG/NaCl and washed with 70% ethanol (2×). Templates are resuspended in water at approximately 250 ng/μl.

Preparation of templates by PCR is a modification of the method of Rosenthal et al. (Rosenthal, et al., Nucleic Acids Res., 1993, 21:173-174). Colonies containing cDNA cloned into pBluescript II or rescued pBluescript phagemid are grown overnight in LB containing ampicillin in a 96 well tissue culture plate. Two μl of the cultures are used as template in a PCR reaction (Saiki, R K, et al., Science, 239:487-493, 1988; and Saiki, R K, et al., Science, 230:1350-1354, 1985) using a tricine buffer system (Ponce and Micol., Nucleic Acids Res., 1992, 20:1992.) and 200 μM dNTPs.

The primer set chosen for amplification of the templates is outside of the primer sites chosen for sequencing of the templates. The primers used are 5′-ATGCTTCCGGCTCGTATG-3′ (SEQ ID NO:27), which is 5′ of the M13 reverse sequence in pBluescript, and 5′-GGGTTTCCCAGTCACGAC-3′ (SEQ ID NO:28), which is 3′ of the M13 forward primer in pBluescript. Any primers which correspond to the sequence flanking the M13 forward and reverse sequences can be used. Perkin-Elmer 9600 thermocyclers are used for amplification of the templates with the following cycler conditions: 5 min at 94°C (1 cycle); 20 sec at 94°C, 20 sec at 55°C and 1 min at 72°C (30 cycles); 7 min at 72°C (1 cycle). Following amplification the PCR templates are precipitated using PEG/NaCl and washed three times with 70% ethanol. The templates are resuspended in water.

EXAMPLE 4 Isolation of a Selected Clone From Breast Tissue

Two approaches are used to isolate a particular clone from a cDNA library prepared from human breast tissue.

In the first, a clone is isolated directly by screening the library using an oligonucleotide probe. To isolate a particular clone, a specific oligonucleotide with 30-40 nucleotides is synthesized using an Applied Biosystems DNA synthesizer according to one of the partial sequences described in this application. The oligonucleotide is labeled with ³²P-ATP using T4 polynucleotide kinase and purified according to the standard protocol (Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1982). The Lambda cDNA library is plated on 1.5% agar plates to a density of 20,000-50,000 pfu/150 mm plate. These plates are screened using Nylon membranes according to the standard phage screening protocol (Stratagene, 1993). Specifically, the Nylon membrane with denatured and fixed phage DNA is prehybridized in 6×SSC, 20 mM NaH₂PO₄, 0.4% SDS, 5× Denhardt's, 500 μg/ml denatured, sonicated salmon sperm DNA; and 6×SSC, 0.1% SDS. After one hour of prehybridization, the membrane is hybridized with hybridization buffer (6×SSC, 20 mM NaH₂PO₄, 0.4% SDS, 500 μg/ml denatured, sonicated salmon sperm DNA) with 1×10⁶ cpm/ml ³²P-probe overnight at 42°C. The membrane is washed at 45-50°C with washing buffer (6×SSC, 0.1% SDS) for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones are isolated and purified by secondary and tertiary screening. The purified clone is sequenced to verify its identity to the partial sequence described in this application.

An alternative approach to screen the cDNA library prepared from human breast tissue is to prepare a DNA probe corresponding to the entire partial sequence. To prepare a probe, two oligonucleotide primers of 17-20 nucleotides derived from both ends of the partial sequence reported are synthesized and purified. These two oligonucleotides are used to amplify the probe using the cDNA library template. The DNA template is prepared from the phage lysate of the cDNA library according to the standard phage DNA preparation protocol (Maniatis et al.). The polymerase chain reaction is carried out in a 25 μl reaction mixture with 0.5 μg of the above cDNA template. The reaction mixture is 1.5-5 mM MgCl₂, 0.01% (w/v) gelatin, 20 μM each of dATP, dCTP, dGTP and dTTP, 25 pmol of each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are performed with the Perkin-Elmer Cetus automated thermal cycler. The amplified product is analyzed by agarose gel electrophoresis and the DNA band with the expected molecular weight is excised and purified. The PCR product is verified to be the probe by subcloning and sequencing the DNA product. The probe is labeled with the Multiprime DNA Labelling System (Amersham) at a specific activity <1×10⁹ dpm/μg. This probe is used to screen the lambda cDNA library according to Stratagene's protocol. Hybridization is carried out with 5×TEN (20×TEN: 0.3 M Tris-HCl pH 8.0, 0.02 M EDTA and 3 M NaCl), 5× Denhardt's, 0.5% sodium pyrophosphate, 0.1% SDS, 0.2 mg/ml heat denatured salmon sperm DNA and 1×10⁶ cpm/ml of [³²P]-labeled probe at 55°C for 12 hours. The filters are washed in 0.5×TEN at room temperature for 20-30 min., then at 55°C for 15 min. The filters are dried and autoradiographed at −70°C using Kodak XAR-5 film. The positive clones are purified by secondary and tertiary screening. The sequence of the isolated clone is verified by DNA sequencing.

General procedures for obtaining complete sequences from partial sequences described herein are summarized as follows:

Procedure 1

Selected human DNA from the partial sequence clone (the cDNA clone that was sequenced to give the partial sequence) is purified, e.g., by endonuclease digestion using Eco RI, gel electrophoresis, and isolation of the clone by removal from low melting agarose gel. The isolated insert DNA is radiolabeled, e.g., with ³²P labels, preferably by nick translation or random primer labeling. The labeled insert is used as a probe to screen a lambda phage cDNA library or a plasmid cDNA library. Colonies containing clones related to the probe cDNA are identified and purified by known purification methods. The ends of the newly purified clones are nucleotide sequenced to identify full length sequences. Complete sequencing of full length clones is then performed by Exonuclease III digestion or primer walking. Northern blots of the mRNA from various tissues using at least part of the deposited clone from which the partial sequence is obtained as a probe can optionally be performed to check the size of the mRNA against that of the purported full length cDNA.

The following procedures 2 and 3 can be used to obtain full length genes or full length coding portions of genes where a clone isolated from the deposited clone mixture does not contain a full length sequence. A library derived from human breast tissue or from the deposited clone mixture is also applicable to obtaining full length sequences from clones obtained from sources other than the deposited mixture by use of the partial sequences of the present invention.

Procedure 2

RACE Protocol for Recovery of Full-Length Genes

Partial cDNA clones can be made full-length by utilizing the rapid amplification of cDNA ends (RACE) procedure described in Frohman, M. A., Dush, M. K. and Martin, G. R. (1988) Proc. Natl. Acad. Sci. USA, 85:8998-9002. A cDNA clone missing either the 5′ or 3′ end can be reconstructed to include the absent base pairs extending to the translational start or stop codon, respectively. In most cases, cDNAs are missing the start of translation. The following briefly describes a modification of this original 5′ RACE procedure. Poly A+ or total RNA is reverse transcribed with Superscript II (Gibco/BRL) and an antisense or complementary primer specific to the cDNA sequence. The primer is removed from the reaction with a Microcon Concentrator (Amicon). The first-strand cDNA is then tailed with dATP and terminal deoxynucleotide transferase (Gibco/BRL). Thus, an anchor sequence is produced which is needed for PCR amplification. The second strand is synthesized from the dA-tail in PCR buffer, Taq DNA polymerase (Perkin-Elmer Cetus), an oligo-dT primer containing three adjacent restriction sites (XhoI, SalI and ClaI) at the 5′ end and a primer containing just these restriction sites. This double-stranded cDNA is PCR amplified for 40 cycles with the same primers as well as a nested cDNA-specific antisense primer. The PCR products are size-separated on an ethidium bromide-agarose gel and the region of gel containing cDNA products of the predicted size of the missing protein-coding DNA is removed. cDNA is purified from the agarose with the Magic PCR Prep kit (Promega), restriction digested with XhoI or SalI, and ligated to a plasmid such as pBluescript SKII (Stratagene) at XhoI and EcoRV sites. This DNA is transformed into bacteria and the plasmid clones sequenced to identify the correct protein-coding inserts. Correct 5′ ends are confirmed by comparing this sequence with the putatively identified homologue and overlap with the partial cDNA clone.

Several quality-controlled kits are available for purchase. Similar reagents and methods to those above are supplied in kit form from Gibco/BRL. A second kit is available from Clontech which is a modification of a related technique, SLIC (single-stranded ligation to single-stranded cDNA), developed by Dumas et al. (Dumas, J. B., Edwards, M., Delort, J. and Mallet, J., 1991, Nucleic Acids Res., 19:5227-5232). The major differences in procedure are that the RNA is alkaline hydrolyzed after reverse transcription and RNA ligase is used to join a restriction site-containing anchor primer to the first-strand cDNA. This obviates the necessity for the dA-tailing reaction which results in a polyT stretch that is difficult to sequence past.

An alternative to generating 5′ cDNA from RNA is to use cDNA library double-stranded DNA. An asymmetric PCR-amplified antisense cDNA strand is synthesized with an antisense cDNA-specific primer and a plasmid-anchored primer. These primers are removed and a symmetric PCR reaction is performed with a nested cDNA-specific antisense primer and the plasmid-anchored primer.

Procedure 3

RNA Ligase Protocol for Generating the 5′ End Sequences to Obtain Full Length Genes

Once a gene of interest is identified, several methods are available for the identification of the 5′ or 3′ portions of the gene which may not be present in the original deposited clone. These methods include but are not limited to filter probing, clone enrichment using specific probes and protocols similar and identical to 5′ and 3′ RACE. While the full length gene may be present in a library and can be identified by probing, a useful method for generating the 5′ end is to use the existing sequence information from the original partial sequence to generate the missing information. A method similar to 5′ RACE is available for generating the missing 5′ end of a desired full-length gene. (This method was published by Fromont-Racine et al., Nucleic Acids Res., 21(7):1683-1684 (1993).) Briefly, a specific RNA oligonucleotide is ligated to the 5′ ends of a population of RNA presumably containing full-length gene RNA transcript, and a primer set containing a primer specific to the ligated RNA oligonucleotide and a primer specific to a known sequence (EST) of the gene of interest is used to PCR amplify the 5′ portion of the desired full length gene, which may then be sequenced and used to generate the full length gene. This method starts with total RNA isolated from the desired source; poly A RNA may be used but is not a prerequisite for this procedure. The RNA preparation may then be treated with phosphatase if necessary to eliminate 5′ phosphate groups on degraded or damaged RNA which may interfere with the later RNA ligase step. The phosphatase, if used, is then inactivated and the RNA is treated with tobacco acid pyrophosphatase in order to remove the cap structure present at the 5′ ends of messenger RNAs. This reaction leaves a 5′ phosphate group at the 5′ end of the cap-cleaved RNA which can then be ligated to an RNA oligonucleotide using T4 RNA ligase. This modified RNA preparation can then be used as a template for first strand cDNA synthesis using a gene-specific oligonucleotide. The first strand synthesis reaction can then be used as a template for PCR amplification of the desired 5′ end using a primer specific to the ligated RNA oligonucleotide and a primer specific to the known sequence (EST) of the gene of interest. The resultant product is then sequenced and analyzed to confirm that the 5′ end sequence belongs to the partial sequence.

EXAMPLE 5 Cloning and Expression of BSG1 Using the Baculovirus Expression System

The DNA sequence encoding the full length BSG1 protein, ATCC # 97175, was amplified using PCR oligonucleotide primers corresponding to the 5′ and 3′ sequences of the gene:

The 5′ primer has the sequence 5′ AAAGGATCCCCCGCCATCATGGATGTTTCAAGAAG 3′ (SEQ ID NO:29) and contains a BamHI restriction enzyme site (in bold) followed by 8 nucleotides resembling an efficient signal for the initiation of translation in eukaryotic cells (Kozak, M., J. Mol. Biol., 196:947-950 (1987)) and the start of the BSG1 gene (the initiation codon for translation “ATG” is underlined).

The 3′ primer has the sequence 5′ AAATCTAGACTAGTCTCCCCCACTCTG 3′ (SEQ ID NO:30) and contains the cleavage site for the restriction endonuclease XbaI and 21 nucleotides complementary to the 3′ sequence of the BSG1 gene. The amplified sequences were isolated from a 1% agarose gel using a commercially available kit (“Geneclean,” BIO 101 Inc., La Jolla, Calif.). The fragment was then digested with the endonucleases BamHI and XbaI and then purified again on a 1% agarose gel. This fragment is designated F2.
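As an illustrative check of the primer features described above, the following Python sketch locates the BamHI and XbaI recognition sequences and the initiation codon in the two primer sequences as printed in this Example. The recognition sequences GGATCC (BamHI) and TCTAGA (XbaI) are the standard ones for these enzymes; the function and variable names are hypothetical and are not part of the procedure described.

```python
# Illustrative sketch only; names are hypothetical, not from the procedure.
BAMHI_SITE = "GGATCC"   # standard BamHI recognition sequence
XBAI_SITE = "TCTAGA"    # standard XbaI recognition sequence

# Primer sequences copied as printed in this Example (5' to 3').
FIVE_PRIME_PRIMER = "AAAGGATCCCCCGCCATCATGGATGTTTCAAGAAG"   # SEQ ID NO:29
THREE_PRIME_PRIMER = "AAATCTAGACTAGTCTCCCCCACTCTG"           # SEQ ID NO:30


def find_site(primer: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return primer.upper().find(site)


bamhi_index = find_site(FIVE_PRIME_PRIMER, BAMHI_SITE)
xbai_index = find_site(THREE_PRIME_PRIMER, XBAI_SITE)

# Search for the initiation codon downstream of the BamHI site.
atg_index = FIVE_PRIME_PRIMER.find("ATG", bamhi_index + len(BAMHI_SITE))

print("BamHI site starts at index", bamhi_index)
print("XbaI site starts at index", xbai_index)
print("First ATG after the BamHI site starts at index", atg_index)
```

In both primers the recognition sequence follows three leading A residues, which is consistent with the subsequent digestion of the amplified fragment with BamHI and XbaI.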

The vector pA2 (modification of pVL941 vector, discussed below) is used for the expression of the BSG1 protein using the baculovirus expression system (for review see: Summers, M. D. and Smith, G. E. 1987, A manual of methods for baculovirus vectors and insect cell culture procedures, Texas Agricultural Experimental Station Bulletin No. 1555). This expression vector contains the strong polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus (AcMNPV) followed by the recognition sites for the restriction endonucleases BamHI and XbaI. The polyadenylation site of the simian virus (SV)40 is used for efficient polyadenylation. For an easy selection of recombinant virus the beta-galactosidase gene from E. coli is inserted in the same orientation as the polyhedrin promoter followed by the polyadenylation signal of the polyhedrin gene. The polyhedrin sequences are flanked at both sides by viral sequences for the cell-mediated homologous recombination of co-transfected wild-type viral DNA. Many other baculovirus vectors could be used in place of pA2 such as pRG1, pAc373, pVL941 and pAcIMI (Luckow, V. A. and Summers, M. D., Virology, 170:31-39).

The plasmid was digested with the restriction enzymes BamHI and XbaI and dephosphorylated using calf intestinal phosphatase by procedures known in the art. The DNA was then isolated from a 1% agarose gel using the commercially available kit (“Geneclean,” BIO 101 Inc., La Jolla, Calif.). This vector DNA is designated V2.

Fragment F2 and the dephosphorylated plasmid V2 were ligated with T4 DNA ligase. E. coli HB101 cells were then transformed and bacteria that contained the plasmid (pBacBSG1) with the BSG1 gene were identified using the enzymes BamHI and XbaI. The sequence of the cloned fragment was confirmed by DNA sequencing.

5 μg of the plasmid pBacBSG1 was co-transfected with 1.0 μg of a commercially available linearized baculovirus (“BaculoGold™ baculovirus DNA”, Pharmingen, San Diego, Calif.) using the lipofection method (Felgner et al., Proc. Natl. Acad. Sci. USA, 84:7413-7417 (1987)).

1 μg of BaculoGold™ virus DNA and 5 μg of the plasmid pBacBSG1 were mixed in a sterile well of a microtiter plate containing 50 μl of serum free Grace's medium (Life Technologies Inc., Gaithersburg, Md.). Afterwards 10 μl Lipofectin plus 90 μl Grace's medium were added, mixed and incubated for 15 minutes at room temperature. Then the transfection mixture was added drop-wise to the Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate with 1 ml Grace's medium without serum. The plate was rocked back and forth to mix the newly added solution. The plate was then incubated for 5 hours at 27° C. After 5 hours the transfection solution was removed from the plate and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum was added. The plate was put back into an incubator and cultivation continued at 27° C. for four days.

After four days the supernatant was collected and a plaque assay was performed similarly to that described by Summers and Smith (supra). As a modification, an agarose gel with “Blue Gal” (Life Technologies Inc., Gaithersburg) was used, which allows an easy isolation of blue stained plaques. (A detailed description of a “plaque assay” can also be found in the user's guide for insect cell culture and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.)

Four days after a serial dilution of the virus was added to the cells, blue stained plaques were picked with the tip of an Eppendorf pipette. The agar containing the recombinant viruses was then resuspended in an Eppendorf tube containing 200 μl of Grace's medium. The agar was removed by a brief centrifugation and the supernatant containing the recombinant baculovirus was used to infect Sf9 cells seeded in 35 mm dishes. Four days later the supernatants of these culture dishes were harvested and then stored at 4° C.

Sf9 cells were grown in Grace's medium supplemented with 10% heat-inactivated FBS. The cells were infected with the recombinant baculovirus V-BSG1 at a multiplicity of infection (MOI) of 2. Six hours later the medium was removed and replaced with SF900 II medium minus methionine and cysteine (Life Technologies Inc., Gaithersburg). 42 hours later 5 μCi of ³⁵S-methionine and 5 μCi of ³⁵S-cysteine (Amersham) were added. The cells were further incubated for 16 hours before they were harvested by centrifugation and the labelled proteins visualized by SDS-PAGE and autoradiography.
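The multiplicity of infection used above is the ratio of infectious virus particles (plaque forming units, pfu) to cells. A minimal Python sketch of the arithmetic follows; the cell count and virus titer are hypothetical values chosen only for illustration and do not come from the procedure described, and the function name is likewise an assumption.

```python
# Illustrative sketch only; the cell count and titer below are hypothetical.
def virus_volume_ml(moi: float, cell_count: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (ml) giving the requested MOI.

    MOI = (titer in pfu/ml * volume in ml) / number of cells,
    so volume = MOI * number of cells / titer.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * cell_count / titer_pfu_per_ml


# Hypothetical example: 1e6 cells in a dish, stock titer of 1e8 pfu/ml, MOI of 2.
example_volume_ml = virus_volume_ml(moi=2, cell_count=1e6, titer_pfu_per_ml=1e8)
print(f"Add {example_volume_ml * 1000:.0f} microliters of virus stock")
```

The titer itself would have to be determined experimentally, for example by a plaque assay of the harvested supernatant.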

EXAMPLE 6 Expression of Recombinant BSG1 in COS cells

The expression plasmid, BSG1 HA, is derived from a vector pcDNAI/Amp (Invitrogen) containing: 1) SV40 origin of replication, 2) ampicillin resistance gene, 3) E. coli replication origin, 4) CMV promoter followed by a polylinker region, an SV40 intron and polyadenylation site. A DNA fragment encoding the entire precursor and a HA tag fused in frame to its 3′ end was cloned into the polylinker region of the vector; therefore, the recombinant protein expression is directed under the CMV promoter. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein as previously described (I. Wilson, H. Niman, R. Houghten, A. Cherenson, M. Connolly, and R. Lerner, Cell 37:767 (1984)). The fusion of the HA tag to the target protein allows easy detection of the recombinant protein with an antibody that recognizes the HA epitope.

The plasmid construction strategy is described as follows:

The DNA sequence encoding BSG1, ATCC # 97175, was constructed by PCR using two primers: the 5′ primer 5′ AAAGGATCCCCCGCCATCATGGATGTTCAAGAAG 3′ (SEQ ID NO:29) contains a BamHI site followed by 18 nucleotides of BSG1 coding sequence starting from the initiation codon; the 3′ primer 5′ AAATCTAGACTAAAGCGTAGTCTGGGACGTCGTATGGGTACTCCTGGGGTCTCCCCCACTCTGGGC 3′ (SEQ ID NO:31) contains complementary sequences to an XbaI site, translation stop codon, HA tag and the last 18 nucleotides of the BSG1 coding sequence (not including the stop codon). Therefore, the PCR product contains a BamHI site, BSG1 coding sequence followed by HA tag fused in frame, a translation termination stop codon next to the HA tag, and an XbaI site. The PCR amplified DNA fragment and the vector, pcDNAI/Amp, were digested with BamHI and XbaI restriction enzymes and ligated. The ligation mixture was transformed into E. coli strain SURE (available from Stratagene Cloning Systems, 11099 North Torrey Pines Road, La Jolla, Calif. 92037), the transformed culture was plated on ampicillin media plates and resistant colonies were selected. Plasmid DNA was isolated from transformants and examined by restriction analysis for the presence of the correct fragment. For expression of the recombinant BSG protein, COS cells were transfected with the expression vector by the DEAE-dextran method (J. Sambrook, E. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, (1989)). The expression of the BSG HA protein was detected by a radiolabelling and immunoprecipitation method (E. Harlow, D. Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, (1988)). Cells were labelled for 8 hours with ³⁵S-cysteine two days post transfection. Culture media was then collected and cells were lysed with detergent (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5) (Wilson, I. et al., Id. 37:767 (1984)). Both cell lysate and culture media were precipitated with an HA specific monoclonal antibody. Proteins precipitated were analyzed on 15% SDS-PAGE gels.
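The statement that the 3′ primer carries sequences complementary to an XbaI site and the HA tag can be illustrated with a short Python sketch that reverse-complements the primer as printed above and searches the resulting sense strand. The HA epitope sequence YPYDVPDYA and the coding sequence assumed for it are taken from general knowledge of the commonly used HA tag, not from this specification, and the function and variable names are hypothetical.

```python
# Illustrative sketch only; HA_EPITOPE and HA_CODING are assumptions
# (the commonly used HA tag), not sequences recited in the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the codons needed to translate the assumed HA coding sequence.
CODON_TABLE = {"TAC": "Y", "CCA": "P", "GAC": "D", "GTC": "V", "GCT": "A"}

THREE_PRIME_PRIMER = (
    "AAATCTAGACTAAAGCGTAGTCTGGGACGTCGTATGGGTACTCCTGGGGTCTCCCCCACTCTGGGC"
)  # SEQ ID NO:31 as printed above, 5' to 3'
XBAI_SITE = "TCTAGA"
HA_EPITOPE = "YPYDVPDYA"
HA_CODING = "TACCCATACGACGTCCCAGACTACGCT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate an in-frame sequence using the small codon table above."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# The primer anneals to the sense strand, so its reverse complement
# reads in the same direction as the coding sequence.
sense_strand = reverse_complement(THREE_PRIME_PRIMER)
ha_start = sense_strand.find(HA_CODING)
xbai_start = sense_strand.find(XBAI_SITE)

print("HA coding sequence found:", ha_start >= 0)
print("HA coding sequence translates to:", translate(HA_CODING))
print("XbaI site found downstream of HA:", xbai_start > ha_start >= 0)
```

Because the XbaI recognition sequence is palindromic, it reads the same on the primer and on the sense strand; the HA tag, by contrast, appears in the primer only as its reverse complement.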

Numerous modifications and variations of the present invention are possible in light of the above teachings and, therefore, within the scope of the appended claims, the invention may be practiced otherwise than as particularly described.

1. An isolated polynucleotide comprising a member selected from the group consisting of: (a) a polynucleotide encoding the same polypeptide as the polynucleotide of FIG. 1 (SEQ ID NO: 1); (b) a polynucleotide encoding the same mature polypeptide as a human gene having a coding portion which includes DNA having at least a 90% identity to the DNA of one of FIGS. 2-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24); (c) a polynucleotide which hybridizes to the polynucleotide of (a) and which has at least a 70% identity thereto; and (d) a polynucleotide encoding the same mature polypeptide as a human gene having a coding portion which includes DNA having at least a 90% identity to a DNA included in the deposited clone.

2. The polynucleotide of claim 1 wherein the human gene includes DNA contained in the deposited clone.

3. The polynucleotide of claim 1 wherein the member is a polynucleotide encoding the same polypeptide as the polynucleotide of FIG. 1 (SEQ ID NO:1).

4. A vector containing the polynucleotide of claim 1.

5. A host cell transformed or transfected with the vector of claim 4.

6. A process for producing cells capable of expressing a polypeptide comprising genetically engineering cells with the vector of claim 4.

7. A process for producing a polypeptide comprising: expressing from the host cell of claim 5 the polypeptide encoded by said polynucleotide.

8. A polypeptide comprising a member selected from the group consisting of: (i) a polypeptide encoded by a human gene, said human gene having a coding portion whose DNA has at least a 90% identity to the DNA of one of FIGS. 2-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24); (ii) a polypeptide encoded by the polynucleotide of FIG. 1 (SEQ ID NO: 1) and fragments, analogs and derivatives thereof; and (iii) a polypeptide encoded by the human gene whose coding region includes a DNA having at least a 90% identity to the DNA contained in the deposited clone and fragments, analogs and derivatives of said polypeptide.

9. The polypeptide of claim 8 wherein the polypeptide is encoded by the polynucleotide having a sequence as set forth in FIG. 1 (SEQ ID NO: 1).

10. An antibody against the polypeptide of claim 8.

11. A compound which inhibits activation of the polypeptide of claim 8.

12. A method for the treatment of a patient having need to inhibit a breast specific gene protein comprising: administering to the patient a therapeutically effective amount of the compound of claim 11.

13. The method of claim 12 wherein the compound is a polypeptide and the therapeutically effective amount of the compound is administered by providing to the patient DNA encoding said polypeptide and expressing said polypeptide in vivo.

14. A method for the treatment of a patient having need of a breast specific gene protein comprising: administering to the patient a therapeutically effective amount of the polypeptide of claim 8.

15. A process for diagnosing a disorder of the breast in a host comprising: determining transcription of a human gene in a sample derived from non-breast tissue of a host, said gene having a coding portion which includes DNA having at least 90% identity to DNA selected from the group consisting of the DNA of FIGS. 1-20 (SEQ ID NO:2, 4, 6, 8, 9, 11-24), whereby said transcription indicates a disorder of the breast in the host.

16. The process of claim 15 wherein transcription is determined by detecting the presence of an altered level of RNA transcribed from said human gene.

17. The process of claim 15 wherein transcription is determined by detecting the presence of an altered level of DNA complementary to the RNA transcribed from said human gene.

18. The process of claim 15 wherein transcription is determined by detecting the presence of an altered level of an expression product of said human gene.

19. A process for determining a disorder of a breast in a host comprising: contacting the antibody of claim 10 to a fluid sample derived from a host; determining the presence of an altered level of a BSG gene product in said sample.

20. A process for identifying antagonists to the polypeptide of claim 8 comprising: contacting said polypeptide with a natural substrate and a labeled compound to be screened either simultaneously or in either consecutive order; and determining whether the therapeutic effectively competes with the natural substrate in a manner sufficient to prevent binding of the protein to its substrate.